Authors
Cees GM Snoek, Xirong Li, Chaoxi Xu, Dennis C Koelma
Publication date
2017
Conference
TRECVID
Description
In this paper we summarize our TRECVID 2017 [1] video recognition and retrieval experiments. We participated in three tasks: video search, event detection, and video description. For both video search and event detection we explore semantic representations based on VideoStory [8] and the ImageNet Shuffle [16], which thrive in few-example regimes. For the video description task we experiment with a deep network, Word2VisualVec [5], which predicts a visual representation from a natural language description, and we use this visual space for sentence matching. For generative description we enhance a neural image captioning model with Early Embedding and Late Reranking [4]. The 2017 edition of the TRECVID benchmark was a fruitful participation for our joint team, resulting in the best overall result for video search and event detection as well as the runner-up position for video description.
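To illustrate the matching idea behind Word2VisualVec, here is a minimal sketch (not the authors' code): a pooled sentence vector is projected by an MLP into a visual feature space, and candidate sentences are ranked against a video's visual feature by cosine similarity. The dimensions, the mean-pooled word-vector input, and the function names are assumptions for illustration only.

```python
# Minimal sketch of Word2VisualVec-style sentence-to-video matching.
# Assumptions (not from the paper text): 500-d pooled sentence vectors,
# 2048-d visual features, and a two-layer MLP projection.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Word2VisualVec(nn.Module):
    """MLP that projects a pooled sentence vector into visual-feature space."""
    def __init__(self, text_dim=500, hidden_dim=1000, visual_dim=2048):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(text_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, visual_dim),
        )

    def forward(self, sentence_vec):
        return self.mlp(sentence_vec)

def rank_sentences(model, video_feature, sentence_vecs):
    """Rank candidate sentences by cosine similarity to the video feature."""
    with torch.no_grad():
        predicted = F.normalize(model(sentence_vecs), dim=-1)  # (N, visual_dim)
        target = F.normalize(video_feature, dim=-1)            # (visual_dim,)
        scores = predicted @ target                            # (N,)
    return scores.argsort(descending=True)

# Toy usage: 5 candidate sentences, each a 500-d pooled word embedding,
# matched against a single 2048-d video feature (e.g., a CNN pooling).
model = Word2VisualVec()
video_feature = torch.randn(2048)
sentence_vecs = torch.randn(5, 500)
print(rank_sentences(model, video_feature, sentence_vecs))
```

The design choice worth noting is that matching happens in the visual space rather than a joint embedding: the text side is trained to land on the video's own features, so the video representation needs no extra projection at retrieval time.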
Total citations
30 (by year, 2017–2024: 4, 3, 5, 5, 3, 6, 2, 2)