Saining Xie

Cited by

	All	Since 2019
Citations	52153	49335
h-index	33	31
i10-index	44	41

Citations per year

Year	Citations
2015	147
2016	336
2017	722
2018	1399
2019	2262
2020	3612
2021	6464
2022	10592
2023	16137
2024	10221

Public access

14 articles available, 0 articles not available (based on funding mandates)

Co-authors

Zhuowen Tu: Professor, Cognitive Science, Computer Science & Engineering, UC San Diego; verified email at ucsd.edu
Kaiming He: Associate Professor, EECS, MIT; verified email at mit.edu
Ross Girshick: Research Scientist, Allen Institute for Artificial Intelligence (AI2); verified email at allenai.org
Piotr Dollár: FAIR; verified email at fb.com
Xinlei Chen: FAIR, Meta; verified email at meta.com
Haoqi Fan: Facebook AI Research; verified email at fb.com
Chen-Yu Lee: Research Scientist, Google; verified email at google.com
Christoph Feichtenhofer: Meta, FAIR; verified email at fb.com
Zhuang Liu: Research Scientist, Meta AI Research; verified email at berkeley.edu
Zhengyou Zhang: Tencent AI Lab & Tencent Robotics X; verified email at tencent.com
Yanghao Li: Facebook AI Research (FAIR); verified email at fb.com
Yuxin Wu: verified email at google.com
Chao-Yuan Wu: Former Research Scientist @ FAIR; verified email at meta.com
Alexander Kirillov: Research Scientist, Facebook AI Research (FAIR); verified email at fb.com
Trevor Darrell: Professor of Computer Science, U.C. Berkeley; verified email at eecs.berkeley.edu
Hanzi Mao: Research Scientist, Nvidia; verified email at nvidia.com
Kevin Murphy: Research Scientist, Google; verified email at google.com
Chen Sun: Assistant Professor, Brown University; verified email at brown.edu
Jonathan Huang: Google; verified email at google.com
Jiashi Feng: ByteDance Inc.; verified email at bytedance.com

Saining Xie

Assistant Professor at the Courant Institute, New York University

Verified email at nyu.edu

Computer Vision, Machine Learning, Deep Learning


Title	Authors	Venue	Cited by	Year
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs	S Tong, E Brown, P Wu, S Woo, M Middepogu, SC Akula, J Yang, S Yang, ...	arXiv preprint arXiv:2406.16860, 2024		2024
Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning	Y Zhai, H Bai, Z Lin, J Pan, S Tong, Y Zhou, A Suhr, S Xie, Y LeCun, Y Ma, ...	arXiv preprint arXiv:2405.10292, 2024	3	2024
Masked Autoencoders for Computer Vision	K He, P Dollar, R Girshick, S Xie, X Chen, LI Yanghao	US Patent App. 17/875,210, 2024		2024
V-IRL: Grounding virtual intelligence in real life	J Yang, R Ding, E Brown, X Qi, S Xie	arXiv preprint arXiv:2402.03310, 2024	4	2024
Deconstructing denoising diffusion models for self-supervised learning	X Chen, Z Liu, S Xie, K He	arXiv preprint arXiv:2401.14404, 2024	14	2024
SiT: Exploring flow and diffusion-based generative models with scalable interpolant transformers	N Ma, M Goldstein, MS Albergo, NM Boffi, E Vanden-Eijnden, S Xie	arXiv preprint arXiv:2401.08740, 2024	27	2024
MoDE: CLIP Data Experts via Clustering	J Ma, PY Huang, S Xie, SW Li, L Zettlemoyer, SF Chang, WT Yih, H Xu	Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024	2	2024
Eyes wide shut? Exploring the visual shortcomings of multimodal LLMs	S Tong, Z Liu, Y Zhai, Y Ma, Y LeCun, S Xie	Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024	54	2024
Image sculpting: Precise object editing with 3D geometry control	J Yenphraphai, X Pan, S Liu, D Panozzo, S Xie	Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024	5	2024
V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs	P Wu, S Xie	arXiv preprint arXiv:2312.14135, 2023	25	2023
Demystifying CLIP Data	H Xu, S Xie, XE Tan, PY Huang, R Howes, V Sharma, SW Li, G Ghosh, ...	ICLR 2024, 2023	54	2023
Going Denser with Open-Vocabulary Part Segmentation	P Sun, S Chen, C Zhu, F Xiao, P Luo, S Xie, Z Yan	ICCV 2023, 2023	18	2023
CiT: Curation in Training for Effective Vision-Language Data	H Xu, S Xie, PY Huang, L Yu, R Howes, G Ghosh, L Zettlemoyer, ...	ICCV 2023, 2023	19	2023
ConvNeXt v2: Co-designing and scaling convnets with masked autoencoders	S Woo, S Debnath, R Hu, X Chen, Z Liu, IS Kweon, S Xie	CVPR 2023, 2023	315	2023
Scalable Diffusion Models with Transformers	W Peebles, S Xie	ICCV 2023, 2023	571	2023
Can we train vision and language zero-shot classification models without syntax?	A Tejankar, M Sanjabi, B Wu, M Khabsa, S Xie, H Pirsiavash, H Firooz	NeurIPS 2022 Workshop: Self-Supervised Learning-Theory and Practice, 2022, 2022	22*	2022
Exploring long-sequence masked autoencoders	R Hu, S Debnath, S Xie, X Chen	arXiv preprint arXiv:2210.07224, 2022	16	2022
A ConvNet for the 2020s	Z Liu, H Mao, CY Wu, C Feichtenhofer, T Darrell, S Xie	CVPR 2022, 11976-11986, 2022	4600	2022
SLIP: Self-supervision meets language-image pre-training	N Mu, A Kirillov, D Wagner, S Xie	ECCV 2022, 2022	364	2022
Masked feature prediction for self-supervised visual pre-training	C Wei, H Fan, S Xie, CY Wu, A Yuille, C Feichtenhofer	Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022	566	2022
