Xian Li
Title · Cited by · Year
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
S Sukhbaatar, O Golovneva, V Sharma, H Xu, XV Lin, B Rozière, J Kahn, ...
arXiv preprint arXiv:2403.07816, 2024
2024
Self-rewarding language models
W Yuan, RY Pang, K Cho, S Sukhbaatar, J Xu, J Weston
arXiv preprint arXiv:2401.10020, 2024
Cited by 120 · 2024
Self-alignment with instruction backtranslation
X Li, P Yu, C Zhou, T Schick, L Zettlemoyer, O Levy, J Weston, M Lewis
arXiv preprint arXiv:2308.06259, 2023
Cited by 107 · 2023
Towards a unified view of sparse feed-forward network in pretraining large language model
ZL Liu, T Dettmers, XV Lin, V Stoyanov, X Li
arXiv preprint arXiv:2305.13999, 2023
Cited by 2 · 2023
Large language model programs
I Schlag, S Sukhbaatar, A Celikyilmaz, W Yih, J Weston, J Schmidhuber, ...
arXiv preprint arXiv:2305.05364, 2023
Cited by 14 · 2023
ToKen: Task decomposition and knowledge infusion for few-shot hate speech detection
B AlKhamissi, F Ladhak, S Iyer, V Stoyanov, Z Kozareva, X Li, P Fung, ...
arXiv preprint arXiv:2205.12495, 2022
Cited by 18 · 2022
Lifting the curse of multilinguality by pre-training modular transformers
J Pfeiffer, N Goyal, XV Lin, X Li, J Cross, S Riedel, M Artetxe
arXiv preprint arXiv:2205.06266, 2022
Cited by 90 · 2022
OPT: Open pre-trained transformer language models
S Zhang, S Roller, N Goyal, M Artetxe, M Chen, S Chen, C Dewan, ...
arXiv preprint arXiv:2205.01068, 2022
Cited by 1880 · 2022
Efficient language modeling with sparse all-MLP
P Yu, M Artetxe, M Ott, S Shleifer, H Gong, V Stoyanov, X Li
arXiv preprint arXiv:2203.06850, 2022
Cited by 11 · 2022
Efficient large scale language modeling with mixtures of experts
M Artetxe, S Bhosale, N Goyal, T Mihaylov, M Ott, S Shleifer, XV Lin, J Du, ...
arXiv preprint arXiv:2112.10684, 2021
Cited by 80 · 2021
Few-shot learning with multilingual language models
XV Lin, T Mihaylov, M Artetxe, T Wang, S Chen, D Simig, M Ott, N Goyal, ...
arXiv preprint arXiv:2112.10668, 2021
Cited by 321* · 2021
Robust optimization for multilingual translation with imbalanced data
X Li, H Gong
Advances in Neural Information Processing Systems 34, 25086-25099, 2021
Cited by 19 · 2021
Pay better attention to attention: Head selection in multilingual and multi-domain sequence modeling
H Gong, Y Tang, J Pino, X Li
Advances in Neural Information Processing Systems 34, 2668-2681, 2021
Cited by 9 · 2021
Do language models have beliefs? Methods for detecting, updating, and visualizing model beliefs
P Hase, M Diab, A Celikyilmaz, X Li, Z Kozareva, V Stoyanov, M Bansal, ...
arXiv preprint arXiv:2111.13654, 2021
Cited by 66 · 2021
Distributionally robust multilingual machine translation
C Zhou, D Levy, X Li, M Ghazvininejad, G Neubig
arXiv preprint arXiv:2109.04020, 2021
Cited by 23 · 2021
Multilingual translation from denoising pre-training
Y Tang, C Tran, X Li, PJ Chen, N Goyal, V Chaudhary, J Gu, A Fan
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 …, 2021
1252021
FST: the FAIR speech translation system for the IWSLT21 multilingual shared task
Y Tang, H Gong, X Li, C Wang, J Pino, H Schwenk, N Goyal
arXiv preprint arXiv:2107.06959, 2021
Cited by 6 · 2021
Improving speech translation by understanding and learning from the auxiliary text translation task
Y Tang, J Pino, X Li, C Wang, D Genzel
arXiv preprint arXiv:2107.05782, 2021
Cited by 65 · 2021
Gender bias amplification during speed-quality optimization in neural machine translation
A Renduchintala, D Diaz, K Heafield, X Li, M Diab
arXiv preprint arXiv:2106.00169, 2021
Cited by 42 · 2021