Tengyu Ma
Title · Cited by · Year
Linguistic Calibration of Long-Form Generations
N Band, X Li, T Ma, T Hashimoto
Forty-first International Conference on Machine Learning, 2024
Year: 2024
Linguistic Calibration of Language Models
N Band, X Li, T Ma, T Hashimoto
arXiv preprint arXiv:2404.00474, 2024
Cited by: 1 · Year: 2024
Chain of thought empowers transformers to solve inherently serial problems
Z Li, H Liu, D Zhou, T Ma
arXiv preprint arXiv:2402.12875, 2024
Cited by: 7 · Year: 2024
What is the Inductive Bias of Flatness Regularization? A Study of Deep Matrix Factorization Models
K Gatmiry, Z Li, T Ma, S Reddi, S Jegelka, CY Chuang
Advances in Neural Information Processing Systems 36, 2024
Year: 2024
Sharpness minimization algorithms do not only minimize sharpness to achieve better generalization
K Wen, Z Li, T Ma
Advances in Neural Information Processing Systems 36, 2024
Cited by: 16 · Year: 2024
Beyond NTK with vanilla gradient descent: A mean-field analysis of neural networks with polynomial width, samples, and time
A Mahankali, H Zhang, K Dong, M Glasgow, T Ma
Advances in Neural Information Processing Systems 36, 2024
Cited by: 8 · Year: 2024
DoReMi: Optimizing data mixtures speeds up language model pretraining
SM Xie, H Pham, X Dong, N Du, H Liu, Y Lu, PS Liang, QV Le, T Ma, ...
Advances in Neural Information Processing Systems 36, 2024
Cited by: 65 · Year: 2024
Data selection for language models via importance resampling
SM Xie, S Santurkar, T Ma, PS Liang
Advances in Neural Information Processing Systems 36, 34201-34227, 2023
Cited by: 73 · Year: 2023
Provable guarantees for self-supervised deep learning with spectral contrastive loss
JZ HaoChen, C Wei, AD Gaidon, T Ma
US Patent App. 17/714,848, 2023
Year: 2023
Toward L_∞ Recovery of Nonlinear Functions: A Polynomial Sample Complexity Bound for Gaussian Random Fields
K Dong, T Ma
The Thirty Sixth Annual Conference on Learning Theory, 2877-2918, 2023
Cited by: 2 · Year: 2023
One step of gradient descent is provably the optimal in-context learner with one layer of linear self-attention
A Mahankali, TB Hashimoto, T Ma
arXiv preprint arXiv:2307.03576, 2023
Cited by: 47 · Year: 2023
Same pre-training loss, better downstream: Implicit bias matters for language models
H Liu, SM Xie, Z Li, T Ma
International Conference on Machine Learning, 22188-22214, 2023
Cited by: 28 · Year: 2023
The inductive bias of flatness regularization for deep matrix factorization
K Gatmiry, Z Li, CY Chuang, S Reddi, T Ma, S Jegelka
arXiv preprint arXiv:2306.13239, 2023
Cited by: 6 · Year: 2023
Large language models as tool makers
T Cai, X Wang, T Ma, X Chen, D Zhou
arXiv preprint arXiv:2305.17126, 2023
Cited by: 98 · Year: 2023
Sophia: A scalable stochastic second-order optimizer for language model pre-training
H Liu, Z Li, D Hall, P Liang, T Ma
arXiv preprint arXiv:2305.14342, 2023
Cited by: 72 · Year: 2023
Symbol tuning improves in-context learning in language models
J Wei, L Hou, A Lampinen, X Chen, D Huang, Y Tay, X Chen, Y Lu, ...
arXiv preprint arXiv:2305.08298, 2023
Cited by: 42 · Year: 2023
Larger language models do in-context learning differently
J Wei, J Wei, Y Tay, D Tran, A Webson, Y Lu, X Chen, H Liu, D Huang, ...
arXiv preprint arXiv:2303.03846, 2023
Cited by: 189 · Year: 2023
How Does Sharpness-Aware Minimization Minimize Sharpness?
K Wen, T Ma, Z Li
The Eleventh International Conference on Learning Representations, 2023
Cited by: 34 · Year: 2023
On the opportunities and risks of foundation models
R Bommasani, DA Hudson, E Adeli, R Altman, S Arora, S von Arx, ...
arXiv preprint arXiv:2108.07258, 2023
Cited by: 67 · Year: 2023
Larger language models do in-context learning differently
J Wei, J Wei, Y Tay, D Tran, A Webson, Y Lu, X Chen, H Liu, D Huang, ...
URL https://arxiv.org/abs/2303.03846, 2023
Cited by: 7 · Year: 2023