Authors
Hovhannes Margaryan, Daniil Hayrapetyan, Wenyan Cong, Zhangyang Wang, Humphrey Shi
Publication date
2024/6/14
Conference
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops
Description
This paper presents a multi-view generation approach that offers comprehensive control over both perspective (viewpoint) and non-perspective attributes (such as depth maps). Our controllable dual-branch pipeline, Depth Guided Branched Diffusion (DGBD), leverages depth maps and perspective information to generate images from alternative viewpoints while preserving shape and size fidelity. In the first DGBD branch, we fine-tune a pre-trained diffusion model on multi-view data, introducing a regularized batch-aware self-attention mechanism for multi-view consistency and generalization. Direct control over perspective is then achieved through cross-attention conditioned on camera position. Meanwhile, the second DGBD branch introduces non-perspective control using depth maps. Qualitative and quantitative experiments validate the effectiveness of our approach, which surpasses or matches state-of-the-art novel-view and multi-view synthesis methods.
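The abstract mentions a regularized batch-aware self-attention mechanism that lets tokens from all views in a batch attend to one another for multi-view consistency. The following is a minimal NumPy sketch of that general idea only; the function name, the `gamma` blending weight, and the specific regularization scheme are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def batch_aware_self_attention(tokens, gamma=0.1):
    """Hypothetical sketch of batch-aware self-attention.

    tokens: array of shape (V, N, D) -- V views, N tokens per view, D channels.
    Tokens from all views are flattened into one joint sequence so every
    token attends across views (encouraging multi-view consistency), then
    blended with ordinary per-view attention as a simple regularizer.
    """
    V, N, D = tokens.shape
    flat = tokens.reshape(V * N, D)              # joint sequence over all views
    scores = flat @ flat.T / np.sqrt(D)          # (V*N, V*N) attention logits
    joint = (softmax(scores) @ flat).reshape(V, N, D)
    # per-view attention, computed independently for each view
    per_view = np.stack(
        [softmax(t @ t.T / np.sqrt(D)) @ t for t in tokens]
    )
    # blend joint and per-view outputs (gamma is an assumed hyperparameter)
    return (1 - gamma) * joint + gamma * per_view

views = np.random.default_rng(0).normal(size=(2, 3, 4))
out = batch_aware_self_attention(views)
```

In a real diffusion U-Net these tokens would be the spatial features of each view's latent, and the attention would use learned query/key/value projections rather than the raw features shown here.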