Jiahui Yu
Research Scientist, OpenAI
Verified email at openai.com
Title, Cited by, Year
Vector-Quantized Image Modeling
YU Jiahui, X Li, H Zhang, V Vasudevan, AYS Ku, JM Baldridge, Y Xu, ...
US Patent App. 18/520,083, 2024
2024
Module-wise adaptive distillation for multimodality foundation models
C Liang, J Yu, MH Yang, M Brown, Y Cui, T Zhao, B Gong, T Zhou
Advances in Neural Information Processing Systems 36, 2024
Cited by: 2; Year: 2024
Parrot: Pareto-optimal multi-reward reinforcement learning framework for text-to-image generation
SH Lee, Y Li, J Ke, I Yoo, H Zhang, J Yu, Q Wang, F Deng, G Entis, J He, ...
arXiv preprint arXiv:2401.05675, 2024
Cited by: 5; Year: 2024
De-diffusion makes text a strong cross-modal interface
C Wei, C Liu, S Qiao, Z Zhang, A Yuille, J Yu
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
Cited by: 2; Year: 2024
Gemini: a family of highly capable multimodal models
G Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ...
arXiv preprint arXiv:2312.11805, 2023
Cited by: 1140; Year: 2023
Pyramid attention network for image restoration
Y Mei, Y Fan, Y Zhang, J Yu, Y Zhou, D Liu, Y Fu, TS Huang, H Shi
International Journal of Computer Vision 131 (12), 3207-3225, 2023
Cited by: 173; Year: 2023
IG Captioner: Information Gain Captioners are Strong Zero-shot Classifiers
C Yang, S Qiao, Y Cao, Y Zhang, T Zhu, A Yuille, J Yu
arXiv preprint arXiv:2311.17072, 2023
2023
Contrastive captioning neural networks
YU Jiahui, Z Wang, V Vasudevan, HM Yeung, SMS Tarzjani, Y Wu
US Patent App. 18/141,340, 2023
2023
Combined scaling for zero-shot transfer learning
H Pham, Z Dai, G Ghiasi, K Kawaguchi, H Liu, AW Yu, J Yu, YT Chen, ...
Neurocomputing 555, 126658, 2023
Cited by: 151; Year: 2023
Systems and Methods for Pretraining Image Processing Models
Z Wang, YU Jiahui, Y Cao, W Yu, Z Dai
US Patent App. 17/685,774, 2023
Cited by: 1; Year: 2023
Systems and Methods for Training Dual-Mode Machine-Learned Speech Recognition Models
YU Jiahui, R Pang, W Han, A Gulati, CC Chiu, B Li, TN Sainath, Y Hu
US Patent App. 18/011,571, 2023
2023
Audiopalm: A large language model that can speak and listen
PK Rubenstein, C Asawaroengchai, DD Nguyen, A Bapna, Z Borsos, ...
arXiv preprint arXiv:2306.12925, 2023
Cited by: 104; Year: 2023
Palm 2 technical report
R Anil, AM Dai, O Firat, M Johnson, D Lepikhin, A Passos, S Shakeri, ...
arXiv preprint arXiv:2305.10403, 2023
Cited by: 1098; Year: 2023
Optimizing Inference Performance for Conformer
TN Sainath, R Botros, A Gulati, K Choromanski, R Pang, T Strohman, ...
US Patent App. 17/936,547, 2023
Cited by: 1; Year: 2023
Predicting Word Boundaries for On-Device Batching of End-To-End Speech Recognition Models
SJP Bijwadia, TN Sainath, YU Jiahui, S Chang, Y He
US Patent App. 17/934,184, 2023
2023
Practical Conformer: Optimizing size, speed and flops of Conformer for on-Device and cloud ASR
R Botros, A Gulati, TN Sainath, K Choromanski, R Pang, T Strohman, ...
arXiv preprint arXiv:2304.00171, 2023
Cited by: 2; Year: 2023
Cobit: A contrastive bi-directional image-text generation model
H You, M Guo, Z Wang, KW Chang, J Baldridge, J Yu
arXiv preprint arXiv:2303.13455, 2023
Cited by: 14; Year: 2023
Noise2music: Text-conditioned music generation with diffusion models
Q Huang, DS Park, T Wang, TI Denk, A Ly, N Chen, Z Zhang, Z Zhang, ...
arXiv preprint arXiv:2302.03917, 2023
Cited by: 123; Year: 2023
Gemini: A family of highly capable multimodal models
R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, J Schalkwyk, ...
arXiv preprint arXiv:2312.11805 1, 2023
Cited by: 111; Year: 2023
Vila: Learning image aesthetics from user comments with vision-language pretraining
J Ke, K Ye, J Yu, Y Wu, P Milanfar, F Yang
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by: 38; Year: 2023