PyTorchVideo: A deep learning library for video understanding

Authors

Haoqi Fan, Tullie Murrell, Heng Wang, Kalyan Vasudev Alwala, Yanghao Li, Yilei Li, Bo Xiong, Nikhila Ravi, Meng Li, Haichuan Yang, Jitendra Malik, Ross Girshick, Matt Feiszli, Aaron Adcock, Wan-Yen Lo, Christoph Feichtenhofer

Publication date

2021/10/17

Book

Proceedings of the 29th ACM International Conference on Multimedia

Pages

3783-3786

Description

We introduce PyTorchVideo, an open-source deep-learning library that provides a rich set of modular, efficient, and reproducible components for a variety of video understanding tasks, including classification, detection, self-supervised learning, and low-level processing. The library covers a full stack of video understanding tools, from multimodal data loading and transformations to models that reproduce state-of-the-art performance. PyTorchVideo further supports hardware acceleration that enables real-time inference on mobile devices. The library is based on PyTorch and can be used with any training framework, for example PyTorch Lightning, PySlowFast, or Classy Vision. PyTorchVideo is available at https://pytorchvideo.org/.

Total citations

Cited by 47

2021: 1, 2022: 22, 2023: 12, 2024: 12
