Self-supervised multi-view learning via auto-encoding 3D transformations

Authors

Xiang Gao, Wei Hu, Guo-Jun Qi

Publication date

2023/9/18

Journal

ACM Transactions on Multimedia Computing, Communications and Applications

Pages

1-23

Publisher

ACM

Description

3D object representation learning is a fundamental challenge in computer vision for reasoning about the 3D world. Recent advances in deep learning have shown their effectiveness in 3D object recognition, with view-based methods performing best so far. However, existing methods mostly learn multi-view features in a supervised fashion, which often requires large amounts of costly labeled data. In contrast, self-supervised learning aims to learn multi-view feature representations without labeled data. To this end, we propose a novel self-supervised framework to learn Multi-View Transformation Equivariant Representations (MV-TER), exploring the equivariant transformations of a 3D object and its projected multiple views that we derive. Specifically, we perform a 3D transformation on a 3D object and obtain multiple views before and after the transformation via …
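
As a rough illustration of the data-generation step described above (transform a 3D object, then take multiple views before and after), the following is a minimal sketch and not the authors' implementation. It assumes a toy point set, a rotation about the z-axis as the 3D transformation, and orthographic projection from evenly spaced azimuths; the names `rotate_z`, `project_view`, and `multi_view` are hypothetical.

```python
import math

def rotate_z(points, angle):
    """Rotate 3D points about the z-axis by `angle` radians (assumed transformation)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

def project_view(points, azimuth):
    """Orthographic view from a camera at `azimuth` around the z-axis.

    Rotating the object by -azimuth aligns the viewing direction with the
    x-axis; dropping x then leaves the 2D image coordinates (y, z).
    """
    aligned = rotate_z(points, -azimuth)
    return [(y, z) for _, y, z in aligned]

def multi_view(points, num_views):
    """Project the object from `num_views` evenly spaced azimuths."""
    return [project_view(points, 2 * math.pi * k / num_views)
            for k in range(num_views)]

# Toy 3D object (hypothetical data) and an assumed 90-degree rotation.
points = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.5), (0.0, 0.0, 1.0)]
transform_angle = math.pi / 2

views_before = multi_view(points, 4)
views_after = multi_view(rotate_z(points, transform_angle), 4)
```

In this toy setup, a 90-degree rotation with four evenly spaced views shifts each view by one camera position, which gives a simple picture of how a single 3D transformation induces a coupled change across all projected views.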

Total citations

Cited by 10

Citations per year: 2022: 1, 2023: 2, 2024: 4
