Saining Xie
Assistant Professor at the Courant Institute, New York University
Verified email at nyu.edu
Title
Cited by
Year
On Scaling Up 3D Gaussian Splatting Training
H Zhao, H Weng, D Lu, A Li, J Li, A Panda, S Xie
arXiv preprint arXiv:2406.18533, 2024
2024
Cambrian-1: A fully open, vision-centric exploration of multimodal LLMs
S Tong, E Brown, P Wu, S Woo, M Middepogu, SC Akula, J Yang, S Yang, ...
arXiv preprint arXiv:2406.16860, 2024
Cited by 6, 2024
Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning
Y Zhai, H Bai, Z Lin, J Pan, S Tong, Y Zhou, A Suhr, S Xie, Y LeCun, Y Ma, ...
arXiv preprint arXiv:2405.10292, 2024
Cited by 5, 2024
Masked Autoencoders for Computer Vision
K He, P Dollar, R Girshick, S Xie, X Chen, Y Li
US Patent App. 17/875,210, 2024
2024
V-IRL: Grounding virtual intelligence in real life
J Yang, R Ding, E Brown, X Qi, S Xie
ECCV 2024, 2024
Cited by 7, 2024
Deconstructing denoising diffusion models for self-supervised learning
X Chen, Z Liu, S Xie, K He
arXiv preprint arXiv:2401.14404, 2024
Cited by 17, 2024
SiT: Exploring flow and diffusion-based generative models with scalable interpolant transformers
N Ma, M Goldstein, MS Albergo, NM Boffi, E Vanden-Eijnden, S Xie
ECCV 2024, 2024
Cited by 30, 2024
What Does a Visual Formal Analysis of the World's 500 Most Famous Paintings Tell Us About Multimodal LLMs?
M Tao, S Xie
ICLR 2024, 2024
2024
MoDE: CLIP Data Experts via Clustering
J Ma, PY Huang, S Xie, SW Li, L Zettlemoyer, SF Chang, WT Yih, H Xu
CVPR 2024, 2024
Cited by 3, 2024
Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs
S Tong, Z Liu, Y Zhai, Y Ma, Y LeCun, S Xie
CVPR 2024, 2024
Cited by 72, 2024
Image sculpting: Precise object editing with 3D geometry control
J Yenphraphai, X Pan, S Liu, D Panozzo, S Xie
CVPR 2024, 2024
Cited by 5, 2024
V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs
P Wu, S Xie
CVPR 2024, 2024
Cited by 30, 2024
Demystifying CLIP Data
H Xu, S Xie, XE Tan, PY Huang, R Howes, V Sharma, SW Li, G Ghosh, ...
ICLR 2024, 2023
Cited by 63, 2023
Going Denser with Open-Vocabulary Part Segmentation
P Sun, S Chen, C Zhu, F Xiao, P Luo, S Xie, Z Yan
ICCV 2023, 2023
Cited by 26, 2023
CiT: Curation in Training for Effective Vision-Language Data
H Xu, S Xie, PY Huang, L Yu, R Howes, G Ghosh, L Zettlemoyer, ...
ICCV 2023, 2023
Cited by 20, 2023
ConvNeXt v2: Co-designing and scaling convnets with masked autoencoders
S Woo, S Debnath, R Hu, X Chen, Z Liu, IS Kweon, S Xie
CVPR 2023, 2023
Cited by 503*, 2023
Scalable Diffusion Models with Transformers
W Peebles, S Xie
ICCV 2023, 2023
Cited by 681, 2023
Can we train vision and language zero-shot classification models without syntax?
A Tejankar, M Sanjabi, B Wu, M Khabsa, S Xie, H Pirsiavash, H Firooz
NeurIPS 2022 Workshop: Self-Supervised Learning - Theory and Practice, 2022
Cited by 22*, 2022
Exploring long-sequence masked autoencoders
R Hu, S Debnath, S Xie, X Chen
arXiv preprint arXiv:2210.07224, 2022
Cited by 16, 2022
A ConvNet for the 2020s
Z Liu, H Mao, CY Wu, C Feichtenhofer, T Darrell, S Xie
CVPR 2022, 11976-11986, 2022
Cited by 4982, 2022