A Vision-free Baseline for Multimodal Grammar Induction B Li, R Corona, K Mangalam, C Chen, D Flaherty, S Belongie, ... arXiv preprint arXiv:2212.10564, 2022 | 1 | 2022 |
Adaptive Human Trajectory Prediction via Latent Corridors N Thakkar, K Mangalam, A Bajcsy, J Malik arXiv preprint arXiv:2312.06653, 2023 | | 2023 |
Big little transformer decoder S Kim, K Mangalam, J Malik, MW Mahoney, A Gholami, K Keutzer arXiv preprint arXiv:2302.07863, 2023 | 22 | 2023 |
Bringing image scene structure to video via frame-clip consistency of object tokens E Ben Avraham, R Herzig, K Mangalam, A Bar, A Rohrbach, L Karlinsky, ... Advances in Neural Information Processing Systems 35, 26839-26855, 2022 | 12 | 2022 |
Diffusion models as masked autoencoders C Wei, K Mangalam, PY Huang, Y Li, H Fan, H Xu, H Wang, C Xie, ... Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 31 | 2023 |
Disentangling human dynamics for pedestrian locomotion forecasting with noisy supervision K Mangalam, E Adeli, KH Lee, A Gaidon, JC Niebles IEEE Winter Conference on Applications of Computer Vision, 2020 | 71 | 2020 |
Do deep neural networks learn shallow learnable examples first? K Mangalam, VU Prabhu Understanding Deep Phenomena, International Conference on Machine Learning, 2019 | 42 | 2019 |
Do Vision and Language Encoders Represent the World Similarly? M Maniparambil, R Akshulakov, YAD Djilali, M El Amine Seddik, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | | 2024 |
Does unsupervised grammar induction need pixels? B Li*, R Corona*, K Mangalam*, C Chen, D Flaherty, S Belongie, ... arXiv preprint arXiv:2212.10564, 2022 | 3 | 2022 |
Dr2Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning C Zhao, S Liu, K Mangalam, G Qian, F Zohra, A Alghannam, J Malik, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 1 | 2024 |
Dr2Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning [Supplementary Material] C Zhao, S Liu, K Mangalam, G Qian, F Zohra, A Alghannam, J Malik, ... | | |
Ego4d: Around the world in 3,000 hours of egocentric video K Grauman, A Westbury, E Byrne, Z Chavis, A Furnari, R Girdhar, ... IEEE Conference on Computer Vision and Pattern Recognition, 2022 | 707 | 2022 |
Egoschema: A diagnostic benchmark for very long-form video language understanding K Mangalam, R Akshulakov, J Malik Advances in Neural Information Processing Systems 36, 2024 | 54 | 2024 |
From goals, waypoints & paths to long term human trajectory forecasting K Mangalam, Y An, H Girase, J Malik IEEE International Conference on Computer Vision, 2021 | 240 | 2021 |
Future person localization in first-person videos T Yagi, K Mangalam, R Yonetani, Y Sato IEEE Conference on Computer Vision and Pattern Recognition, 2018 | 210 | 2018 |
It is not the journey but the destination: Endpoint conditioned trajectory prediction K Mangalam, H Girase, S Agarwal, KH Lee, E Adeli, J Malik, A Gaidon European Conference on Computer Vision, 2020 | 439 | 2020 |
Latency matters: Real-time action forecasting transformer H Girase, N Agarwal, C Choi, K Mangalam Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 12 | 2023 |
Latency Matters: Real-Time Action Forecasting Transformer (Supplementary) H Girase, N Agarwal, C Choi, K Mangalam | | |
Latency-Aware Short-Term Video Action Anticipation and its Application in Trajectory Prediction H Girase, K Mangalam, J Malik | | 2023 |
Learning spontaneity to improve emotion recognition in speech K Mangalam, T Guha Interspeech, 2018 | 20 | 2018 |