arXiv:2305.10320 Abstract | arXiv Analytics

arXiv:2305.10320 [cs.CV]

CostFormer: Cost Transformer for Cost Aggregation in Multi-view Stereo

Weitao Chen, Hongbin Xu, Zhipeng Zhou, Yang Liu, Baigui Sun, Wenxiong Kang, Xuansong Xie

Published 2023-05-17, Version 1

The core of Multi-view Stereo (MVS) is the matching process between reference and source pixels. Cost aggregation plays a significant role in this process, and previous methods have handled it with CNNs. These methods may inherit a natural limitation of CNNs: limited local receptive fields make it hard to discriminate repetitive or incorrect matches. To address this issue, we introduce Transformers into cost aggregation. However, the computational complexity of a Transformer grows quadratically, which can cause memory overflow and inference latency. In this paper, we overcome these limits with an efficient Transformer-based cost aggregation network, namely CostFormer. The Residual Depth-Aware Cost Transformer (RDACT) is proposed to aggregate long-range features on the cost volume via self-attention mechanisms along the depth and spatial dimensions. Furthermore, the Residual Regression Transformer (RRT) is proposed to enhance spatial attention. The proposed method is a universal plug-in to improve learning-based MVS methods.
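To make the quadratic-cost concern concrete, the following is a minimal sketch and not the CostFormer architecture. It assumes a cost volume of shape depth x height x width and compares the number of query-key pairs for attention over every cell against attention restricted to each pixel's depth hypotheses. The function names, the depth-only restriction, and the projection-free attention are all illustrative assumptions; the abstract does not specify how RDACT or RRT are built.

```python
import math


def attention_pairs_global(depth, height, width):
    """Query-key pairs when every cost-volume cell attends to every other cell."""
    tokens = depth * height * width
    return tokens * tokens


def attention_pairs_depth_only(depth, height, width):
    """Query-key pairs when each pixel attends only across its own depth hypotheses.

    Hypothetical restriction used for illustration, not the paper's design.
    """
    return height * width * depth * depth


def depth_self_attention(costs):
    """Toy single-head self-attention over one pixel's scalar costs along depth.

    Queries, keys and values are the raw costs themselves (no learned
    projections), so this only illustrates the softmax-weighted mixing step.
    """
    mixed = []
    for query in costs:
        scores = [query * key for key in costs]
        peak = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(score - peak) for score in scores]
        total = sum(weights)
        mixed.append(sum(w * v for w, v in zip(weights, costs)) / total)
    return mixed
```

For a hypothetical volume with 32 depth hypotheses at 64 x 80 resolution, the global count is 26,843,545,600 pairs while the depth-only count is 5,242,880, a factor of 5,120 (the number of pixels). This is the kind of gap that motivates restricting or factorizing attention over a cost volume.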

Comments: Accepted by IJCAI-23

Categories: cs.CV

Keywords: multi-view stereo, CostFormer, residual depth-aware cost transformer, efficient transformer-based cost aggregation network, enhance spatial attention

Related articles:

arXiv:2106.15328 [cs.CV] (Published 2021-06-18)

Deep Learning for Multi-View Stereo via Plane Sweep: A Survey

Qingtian Zhu

arXiv:2401.11673 [cs.CV] (Published 2024-01-22)

MVSFormer++: Revealing the Devil in Transformer's Details for Multi-View Stereo

Chenjie Cao, Xinlin Ren, Yanwei Fu

arXiv:2205.14320 [cs.CV] (Published 2022-05-28)

RIAV-MVS: Recurrent-Indexing an Asymmetric Volume for Multi-View Stereo

Changjiang Cai, Pan Ji, Yi Xu
