Vijay Anand Korthikanti
Principal Research Scientist, Nvidia
Verified email at uiuc.edu
Title
Cited by
Year
Using DeepSpeed and Megatron to train Megatron-Turing NLG 530B, a large-scale generative language model
S Smith, M Patwary, B Norick, P LeGresley, S Rajbhandari, J Casper, ...
arXiv preprint arXiv:2201.11990, 2022
Cited by: 582; Year: 2022
Efficient large-scale language model training on GPU clusters using Megatron-LM
D Narayanan, M Shoeybi, J Casper, P LeGresley, M Patwary, ...
Proceedings of the International Conference for High Performance Computing …, 2021
Cited by: 545; Year: 2021
Synthesizing geometry constructions
S Gulwani, VA Korthikanti, A Tiwari
ACM SIGPLAN Notices 46 (6), 50-61, 2011
Cited by: 172; Year: 2011
Reducing activation recomputation in large transformer models
VA Korthikanti, J Casper, S Lym, L McAfee, M Andersch, M Shoeybi, ...
Proceedings of Machine Learning and Systems 5, 341-353, 2023
Cited by: 158; Year: 2023
Rewon Child, Reza Yazdani Aminabadi, Julie Bernauer, Xia Song, Mohammad Shoeybi, Yuxiong He, Michael Houston, Saurabh Tiwary, and Bryan Catanzaro
S Smith, M Patwary, B Norick, P LeGresley, S Rajbhandari, J Casper, ...
Using DeepSpeed and Megatron to train Megatron-Turing NLG 530B, a large …, 2022
Cited by: 128; Year: 2022
Towards optimizing energy costs of algorithms for shared memory architectures
VA Korthikanti, G Agha
Proceedings of the twenty-second annual ACM symposium on Parallelism in …, 2010
Cited by: 74; Year: 2010
Reasoning about MDPs as transformers of probability distributions
VA Korthikanti, M Viswanathan, G Agha, YM Kwon
2010 Seventh International Conference on the Quantitative Evaluation of …, 2010
Cited by: 56; Year: 2010
Analysis of parallel algorithms for energy conservation in scalable multicore architectures
VA Korthikanti, G Agha
2009 International Conference on Parallel Processing, 212-219, 2009
Cited by: 56; Year: 2009
Model checking MDPs with a unique compact invariant set of distributions
R Chadha, VA Korthikanti, M Viswanathan, G Agha, YM Kwon
2011 Eighth International Conference on Quantitative Evaluation of SysTems …, 2011
Cited by: 25; Year: 2011
Fair k mutual exclusion algorithm for peer to peer systems
VA Reddy, P Mittal, I Gupta
2008 The 28th International Conference on Distributed Computing Systems, 655-662, 2008
Cited by: 20; Year: 2008
Using DeepSpeed and Megatron to train Megatron-Turing NLG 530B, a large-scale generative language model. arXiv 2022
S Smith, M Patwary, B Norick, P LeGresley, S Rajbhandari, J Casper, ...
arXiv preprint arXiv:2201.11990
Cited by: 17
Re-ViLM: Retrieval-augmented visual language model for zero and few-shot image captioning
Z Yang, W Ping, Z Liu, V Korthikanti, W Nie, DA Huang, L Fan, Z Yu, S Lan, ...
arXiv preprint arXiv:2302.04858, 2023
Cited by: 16; Year: 2023
Using DeepSpeed and Megatron to train Megatron-Turing NLG 530B, a large-scale generative language model. arXiv
S Smith, M Patwary, B Norick, P LeGresley, S Rajbhandari, J Casper, ...
Preprint published online January 28, 2022
Cited by: 13; Year: 2022
Energy-performance trade-off analysis of parallel algorithms
VA Korthikanti, G Agha
USENIX Workshop on Hot Topics in Parallelism (HotPar), 2010
Cited by: 13; Year: 2010
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B
S Smith, M Patwary, B Norick, P LeGresley, S Rajbhandari, J Casper, ...
A large-scale generative language model, 2022
Cited by: 12; Year: 2022
On the energy complexity of parallel algorithms
VA Korthikanti, G Agha, M Greenstreet
2011 International Conference on Parallel Processing, 562-570, 2011
Cited by: 11; Year: 2011
Avoiding energy wastage in parallel applications
VA Korthikanti, G Agha
International Conference on Green Computing, 149-163, 2010
Cited by: 11; Year: 2010
An efficient algorithm to reduce test power consumption by scan cell and scan vector reordering
KVA Reddy, S Chattopadhyay
Proceedings of the IEEE INDICON 2004. First India Annual Conference, 2004 …, 2004
Cited by: 10; Year: 2004
An Empirical Study of Mamba-based Language Models
R Waleffe, W Byeon, D Riach, B Norick, V Korthikanti, T Dao, A Gu, ...
arXiv preprint arXiv:2406.07887, 2024
Cited by: 9; Year: 2024
Energy bounded scalability analysis of parallel algorithms
VA Korthikanti, GA Agha
Cited by: 9; Year: 2009