Authors
Xiang Gao, Wei Hu, Guo-Jun Qi
Publication date
2023/9/18
Journal
ACM Transactions on Multimedia Computing, Communications and Applications
Volume
20
Issue
1
Pages
1-23
Publisher
ACM
Description
3D object representation learning is a fundamental challenge in computer vision for inferring the structure of the 3D world. Recent advances in deep learning have proven effective for 3D object recognition, and view-based methods have performed best so far. However, existing methods mostly learn multi-view features in a supervised fashion, which often requires large amounts of labeled data that are costly to obtain. In contrast, self-supervised learning aims to learn multi-view feature representations without labeled data. To this end, we propose a novel self-supervised framework to learn Multi-View Transformation Equivariant Representations (MV-TER), exploring the equivariant transformations of a 3D object and its projected multiple views that we derive. Specifically, we perform a 3D transformation on a 3D object and obtain multiple views before and after the transformation via …
Total citations
Chart: citations per year, 2022 to 2024
Scholar articles
X Gao, W Hu, GJ Qi - ACM Transactions on Multimedia Computing …, 2023