Authors
Guangyu Zhong, Yi-Hsuan Tsai, Ming-Hsuan Yang
Publication date
2017
Conference
Computer Vision–ACCV 2016: 13th Asian Conference on Computer Vision, Taipei, Taiwan, November 20-24, 2016, Revised Selected Papers, Part I 13
Pages
20-36
Publisher
Springer International Publishing
Description
In this paper, we propose a scene co-parsing framework to assign pixel-wise semantic labels in weakly labeled videos, i.e., videos for which only video-level category labels are given. To exploit rich semantic information, we first collect all videos that share the same video-level labels and segment them into supervoxels. We then select representative supervoxels for each category via a supervoxel ranking process. This ranking problem is formulated with a submodular objective function, and a scene-object classifier is incorporated to distinguish scenes from objects. To assign each supervoxel a semantic label, we match it to the selected representatives in the feature domain. Each supervoxel is then associated with a series of category potentials and assigned the semantic label with the maximum potential. The proposed co-parsing framework extends scene parsing from single images to videos and exploits …
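The select-then-match pipeline described above can be illustrated with a small sketch. It is not the paper's formulation: the abstract does not state the objective, so a facility-location function (a standard monotone submodular objective) stands in for it, the RBF similarity is an assumed choice, the scene-object classifier is omitted, and every function name here is hypothetical.

```python
import numpy as np

def rbf_similarity(a, b):
    """Assumed similarity: exp(-squared Euclidean distance) between feature rows."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2)

def greedy_select(sim, k):
    """Greedily maximize the facility-location objective
    F(S) = sum_i max_{j in S} sim[i, j], a stand-in submodular objective."""
    n = sim.shape[0]
    chosen = np.zeros(n, dtype=bool)
    best = np.zeros(n)  # how well each item is covered by the current set
    order = []
    for _ in range(min(k, n)):
        # Marginal gain of adding each candidate j to the selected set.
        gains = np.maximum(sim, best[:, None]).sum(axis=0) - best.sum()
        gains[chosen] = -np.inf  # never pick the same item twice
        j = int(np.argmax(gains))
        chosen[j] = True
        order.append(j)
        best = np.maximum(best, sim[:, j])
    return order

def assign_labels(features, reps_by_category):
    """Give each supervoxel the category with the maximum potential, where the
    potential is (by assumption) its best similarity to that category's
    representatives."""
    categories = sorted(reps_by_category)
    potentials = np.stack(
        [rbf_similarity(features, reps_by_category[c]).max(axis=1)
         for c in categories],
        axis=1,
    )
    labels = [categories[i] for i in potentials.argmax(axis=1)]
    return labels, potentials
```

In this toy setting, `greedy_select` would be run once per category on the supervoxels pooled from all videos carrying that label, and the chosen rows would be passed to `assign_labels` as that category's representatives.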
Total citations
Citations per year, 2018 to 2023 (chart)
Scholar articles
G Zhong, YH Tsai, MH Yang - Computer Vision–ACCV 2016: 13th Asian Conference …, 2017