Do we really need scene-specific pose encoders?

Authors

Yoli Shavit, Ron Ferens

Publication date

2021/1/10

Conference

2020 25th International Conference on Pattern Recognition (ICPR)

Pages

3186-3192

Publisher

IEEE

Description

Visual pose regression models estimate the camera pose from a query image with a single forward pass. Current models learn pose encoding from an image using deep convolutional networks which are trained per scene. The resulting encoding is typically passed to a multi-layer perceptron in order to regress the pose. In this work, we propose that scene-specific pose encoders are not required for pose regression and that encodings trained for visual similarity can be used instead. In order to test our hypothesis, we take a shallow architecture of several fully connected layers and train it with pre-computed encodings from a generic image retrieval model. We find that these encodings are not only sufficient to regress the camera pose, but that, when provided to a branching fully connected architecture, a trained model can achieve competitive results and even surpass current state-of-the-art pose regressors in some …
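The description above outlines a shallow, branching fully connected regressor that consumes pre-computed image-retrieval encodings. The following is only a minimal sketch of that idea, not the authors' implementation: the encoding size (2048), the hidden width (256), the number of layers, the split into one position branch and one orientation branch, the unit-quaternion output, and all function names are assumptions made for illustration. Training is omitted; only the forward pass and the usual position/orientation error measures are shown.

```python
import numpy as np


def init_mlp(rng, sizes):
    """Create (weight, bias) pairs for a stack of fully connected layers."""
    return [
        (rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
        for m, n in zip(sizes[:-1], sizes[1:])
    ]


def mlp(params, x):
    """Apply ReLU after every layer except the last, which stays linear."""
    for W, b in params[:-1]:
        x = np.maximum(x @ W + b, 0.0)
    W, b = params[-1]
    return x @ W + b


def init_model(rng, enc_dim=2048, hidden=256):
    """Hypothetical sizes: a shared trunk followed by two branches."""
    return {
        "trunk": init_mlp(rng, [enc_dim, hidden]),
        "position": init_mlp(rng, [hidden, hidden, 3]),     # x, y, z
        "orientation": init_mlp(rng, [hidden, hidden, 4]),  # quaternion
    }


def regress_pose(model, encodings):
    """Map pre-computed encodings (batch, enc_dim) to positions and unit quaternions."""
    h = np.maximum(mlp(model["trunk"], encodings), 0.0)
    t = mlp(model["position"], h)
    q = mlp(model["orientation"], h)
    norm = np.linalg.norm(q, axis=-1, keepdims=True)
    q = q / np.maximum(norm, 1e-12)  # guard against division by zero
    return t, q


def pose_errors(t_pred, q_pred, t_true, q_true):
    """Position error (Euclidean distance) and orientation error (degrees)."""
    pos_err = np.linalg.norm(t_pred - t_true, axis=-1)
    # q and -q encode the same rotation, hence the absolute value.
    dot = np.clip(np.abs(np.sum(q_pred * q_true, axis=-1)), 0.0, 1.0)
    rot_err = np.degrees(2.0 * np.arccos(dot))
    return pos_err, rot_err
```

In this sketch the encodings would be computed once, offline, by whichever retrieval model is chosen, so only the small fully connected layers would need to be fitted per scene.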

Total citations

Cited by 24

Citations per year: 2021: 2, 2022: 6, 2023: 12, 2024: 4
