Quentin Anthony

Cited by

	All	Since 2019
Citations	2145	2144
h-index	14	14
i10-index	15	15

Citations per year: 2020: 13, 2021: 28, 2022: 132, 2023: 712, 2024: 1251

Public access (based on funding mandates): 16 articles available, 0 articles not available

Co-authors

Stella Biderman: Booz Allen Hamilton, EleutherAI (verified email at bah.com)
Hari Subramoni: The Ohio State University (verified email at cse.ohio-state.edu)
Dhabaleswar K. Panda: Professor of Computer Science, The Ohio State University (verified email at cse.ohio-state.edu)
Hailey Schoelkopf: Researcher, EleutherAI (verified email at eleuther.ai)
Aamir Shafi: Research Scientist, Ohio State University (verified email at osu.edu)
Ammar Ahmad Awan: Microsoft (verified email at osu.edu)

Quentin Anthony

PhD Student, Ohio State University

Verified email at osu.edu

Research interests: HPC, Deep Learning, Parallel Computing


Title	Authors	Venue	Cited by	Year
GPT-NeoX-20B: An open-source autoregressive language model	S Black, S Biderman, E Hallahan, Q Anthony, L Gao, L Golding, H He, ...	Proceedings of the ACL Workshop on Challenges & Perspectives in Creating …, 2022	708	2022
Pythia: A suite for analyzing large language models across training and scaling	S Biderman, H Schoelkopf, Q Anthony, H Bradley, K O'Brien, E Hallahan, ...	International conference on machine learning (ICML), 2023	661	2023
RWKV: Reinventing RNNs for the transformer era	B Peng, E Alcaide, Q Anthony, A Albalak, S Arcadinho, S Biderman, ...	arXiv preprint arXiv:2305.13048, 2023	318*	2023
Emergent and Predictable Memorization in Large Language Models	S Biderman, US Prashanth, L Sutawika, H Schoelkopf, Q Anthony, ...	https://arxiv.org/pdf/2304.11158.pdf, 2023	95	2023
GEMS: GPU-enabled memory-aware model-parallelism system for distributed DNN training	A Jain, AA Awan, AM Aljuhani, JM Hashmi, QG Anthony, H Subramoni, ...	SC20: International Conference for High Performance Computing, Networking …, 2020	52	2020
Performance characterization of DNN training using TensorFlow and PyTorch on modern clusters	A Jain, AA Awan, Q Anthony, H Subramoni, DK Panda	2019 IEEE International Conference on Cluster Computing (CLUSTER), 1-11, 2019	44	2019
Continual Pre-Training of Large Language Models: How to (re)warm your model?	K Gupta, B Thérien, A Ibrahim, ML Richter, Q Anthony, E Belilovsky, I Rish, ...		41	2023
GPT-NeoX: Large scale autoregressive language modeling in PyTorch	A Andonian, Q Anthony, S Biderman, S Black, P Gali, L Gao, E Hallahan, ...		34*	2021
Simple and scalable strategies to continually pre-train large language models	A Ibrahim, B Thérien, K Gupta, ML Richter, Q Anthony, T Lesort, ...	arXiv preprint arXiv:2403.08763, 2024	22	2024
BlackMamba: Mixture of experts for state-space models	Q Anthony, Y Tokpanov, P Glorioso, B Millidge	arXiv preprint arXiv:2402.01771, 2024	20	2024
trlX: A framework for large scale reinforcement learning from human feedback	A Havrilla, M Zhuravinskyi, D Phung, A Tiwari, J Tow, S Biderman, ...	Proceedings of the 2023 Conference on Empirical Methods in Natural Language …, 2023	18	2023
Accelerating MPI all-to-all communication with online compression on modern GPU clusters	Q Zhou, P Kousha, Q Anthony, K Shafie Khorassani, A Shafi, ...	International Conference on High Performance Computing, 3-25, 2022	17	2022
Eagle and Finch: RWKV with matrix-valued states and dynamic recurrence	B Peng, D Goldstein, Q Anthony, A Albalak, E Alcaide, S Biderman, ...	arXiv preprint arXiv:2404.05892, 2024	16	2024
Adaptive and hierarchical large message all-to-all communication algorithms for large-scale dense GPU systems	KS Khorassani, CH Chu, QG Anthony, H Subramoni, DK Panda	2021 IEEE/ACM 21st International Symposium on Cluster, Cloud and Internet …, 2021	15	2021
HyPar-Flow: Exploiting MPI and Keras for scalable hybrid-parallel DNN training using TensorFlow	AA Awan, A Jain, Q Anthony, H Subramoni, DK Panda	arXiv preprint arXiv:1911.05146, 2019	14*	2019
Accelerating distributed deep learning training with compression assisted allgather and reduce-scatter communication	Q Zhou, Q Anthony, L Xu, A Shafi, M Abduljabbar, H Subramoni, ...	2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2023	9	2023
Efficient training of semantic image segmentation on Summit using Horovod and MVAPICH2-GDR	Q Anthony, AA Awan, A Jain, H Subramoni, DK Panda	2020 IEEE International Parallel and Distributed Processing Symposium …, 2020	8	2020
MCR-DL: Mix-and-match communication runtime for deep learning	Q Anthony, AA Awan, J Rasley, Y He, A Shafi, M Abduljabbar, ...	2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2023	7	2023
Zamba: A Compact 7B SSM Hybrid Model	P Glorioso, Q Anthony, Y Tokpanov, J Whittington, J Pilault, A Ibrahim, ...	arXiv preprint arXiv:2405.16712, 2024	6	2024
Accelerating broadcast communication with GPU compression for deep learning workloads	Q Zhou, Q Anthony, A Shafi, H Subramoni, DK Panda	2022 IEEE 29th International Conference on High Performance Computing, Data …, 2022	6	2022
