Authors
Hakan Bilen, Basura Fernando, Efstratios Gavves, Andrea Vedaldi
Publication date
2017/11/2
Journal
IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume
40
Issue
12
Pages
2799-2813
Publisher
IEEE
Description
We introduce the concept of a dynamic image, a novel compact representation of videos useful for video analysis, particularly in combination with convolutional neural networks (CNNs). A dynamic image encodes temporal data such as RGB or optical flow videos by using the concept of 'rank pooling'. The idea is to learn a ranking machine that captures the temporal evolution of the data and to use the parameters of the latter as a representation. We call the resulting representation a dynamic image because it summarizes the video dynamics in addition to appearance. This powerful idea makes it possible to convert any video to an image so that existing CNN models pre-trained with still images can be immediately extended to videos. We also present an efficient approximate rank pooling operator that runs two orders of magnitude faster than the standard ones without any loss in ranking performance and can be formulated as a CNN …
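As a rough illustration of the idea of collapsing a video into a single image, the sketch below computes a weighted sum of frames over time. The weights alpha_t = 2t - T - 1 are an assumption not stated in the description above: they are what a single gradient step on a pairwise ranking objective over raw frames yields when starting from zero, since frame t is the later frame in t - 1 pairs and the earlier frame in T - t pairs. The function names and the flat-list frame layout are hypothetical, chosen only to keep the example dependency-free.

```python
def rank_pooling_coefficients(num_frames):
    """Weights alpha_t = 2t - T - 1 for t = 1..T (assumed simplified variant)."""
    T = num_frames
    return [2 * t - T - 1 for t in range(1, T + 1)]


def dynamic_image(frames):
    """Collapse a video into one frame-shaped vector by a weighted sum over time.

    frames: list of T frames, each a flat list of pixel values of equal length.
    """
    if not frames:
        raise ValueError("need at least one frame")
    alphas = rank_pooling_coefficients(len(frames))
    pooled = [0.0] * len(frames[0])
    for alpha, frame in zip(alphas, frames):
        for i, value in enumerate(frame):
            pooled[i] += alpha * value
    return pooled


# Toy "video": 4 frames of 3 pixels.
# Pixel 0 brightens over time, pixel 1 is static, pixel 2 dims.
video = [
    [0.0, 5.0, 3.0],
    [1.0, 5.0, 2.0],
    [2.0, 5.0, 1.0],
    [3.0, 5.0, 0.0],
]
result = dynamic_image(video)
print(result)  # [10.0, 0.0, -10.0]
```

Because the weights sum to zero, static content cancels out (pixel 1), while content that grows or fades over time leaves a positive or negative trace. In a real pipeline the pooled values would be rescaled to a valid image range before being passed to a pre-trained CNN; that step is omitted here.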
Total citations
Citations per year: 2017: 4, 2018: 24, 2019: 46, 2020: 50, 2021: 55, 2022: 49, 2023: 27, 2024: 17
Scholar articles
H Bilen, B Fernando, E Gavves, A Vedaldi - IEEE transactions on pattern analysis and machine …, 2017