Authors
Peng Wang, Xiaohui Shen, Zhe Lin, Scott Cohen, Brian Price, Alan L Yuille
Publication date
2015
Conference
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
Pages
2800-2809
Description
Depth estimation and semantic segmentation are two fundamental problems in image understanding. While the two tasks are strongly correlated and mutually beneficial, they are usually solved separately or sequentially. Motivated by the complementary properties of the two tasks, we propose a unified framework for joint depth and semantic prediction. Given an image, we first use a trained Convolutional Neural Network (CNN) to jointly predict a global layout composed of pixel-wise depth values and semantic labels. By allowing for interactions between the depth and semantic information, the joint network provides more accurate depth prediction than a state-of-the-art CNN trained solely for depth prediction [5]. To further obtain fine-level details, the image is decomposed into local segments for region-level depth and semantic prediction under the guidance of the global layout. Utilizing the pixel-wise global prediction and region-wise local prediction, we formulate the inference problem in a two-layer Hierarchical Conditional Random Field (HCRF) to produce the final depth and semantic map. As demonstrated in the experiments, our approach effectively leverages the advantages of both tasks and provides state-of-the-art results.
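The core idea of the global stage is a single CNN with a shared trunk and two task-specific heads, so that depth and semantic predictions can influence each other through shared features. Below is a minimal sketch of such a joint network; it is not the authors' architecture, and all names, layer sizes, class count, and loss weights (e.g. JointDepthSemanticNet, num_classes=40, the 0.5 weighting) are illustrative assumptions.

```python
# Minimal sketch of joint pixel-wise depth + semantic prediction with a
# shared CNN trunk. Not the paper's architecture; purely illustrative.
import torch
import torch.nn as nn

class JointDepthSemanticNet(nn.Module):
    def __init__(self, num_classes: int = 40):
        super().__init__()
        # Shared trunk: both tasks use the same features, which is what
        # lets depth and semantic information interact.
        self.trunk = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
        )
        # Task-specific heads: one depth channel, one score per class.
        self.depth_head = nn.Conv2d(64, 1, 1)
        self.semantic_head = nn.Conv2d(64, num_classes, 1)

    def forward(self, image: torch.Tensor):
        features = self.trunk(image)
        depth = self.depth_head(features)          # (B, 1, H, W)
        semantics = self.semantic_head(features)   # (B, num_classes, H, W)
        return depth, semantics

# Joint training sketch: regression loss on depth plus cross-entropy on
# semantic labels; the 0.5 weighting is an arbitrary assumption.
model = JointDepthSemanticNet()
image = torch.randn(2, 3, 64, 64)
depth_gt = torch.rand(2, 1, 64, 64)
label_gt = torch.randint(0, 40, (2, 64, 64))
depth_pred, sem_pred = model(image)
loss = nn.functional.l1_loss(depth_pred, depth_gt) \
     + 0.5 * nn.functional.cross_entropy(sem_pred, label_gt)
loss.backward()
```

In the full method the per-pixel outputs of such a network serve as the global layer of the two-layer HCRF, while region-level predictions on local segments form the second layer; the sketch above covers only the joint global prediction step.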
Total citations
Citations by year, 2015–2024
Scholar articles
P Wang, X Shen, Z Lin, S Cohen, B Price, AL Yuille - Proceedings of the IEEE conference on computer …, 2015