arXiv:2203.04895 Abstract | arXiv Analytics

arXiv:2203.04895 [cs.CV]

Joint Learning of Salient Object Detection, Depth Estimation and Contour Extraction

Xiaoqi Zhao, Youwei Pang, Lihe Zhang, Huchuan Lu

Published 2022-03-09, Version 1

Because depth maps are color-independent, illumination-invariant and location-discriminative, they can provide important supplemental information for extracting salient objects in complex environments. However, high-quality depth sensors are expensive and cannot be widely deployed, while general depth sensors produce noisy and sparse depth information, which introduces irreversible interference into depth-based networks. In this paper, we propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD). Specifically, we unify three complementary tasks: depth estimation, salient object detection and contour estimation. The multi-task mechanism encourages the model to learn task-aware features from the auxiliary tasks. In this way, the depth information can be completed and purified. Moreover, we introduce a multi-modal filtered transformer (MFT) module, which is equipped with three modality-specific filters to generate the transformer-enhanced feature for each modality. The proposed model works in a depth-free style during the testing phase. Experiments show that it not only significantly surpasses depth-based RGB-D SOD methods on multiple datasets, but also precisely predicts a high-quality depth map and salient contour at the same time. In addition, the resulting depth map can help existing RGB-D SOD methods obtain significant performance gains.
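Joint training over several tasks is often expressed as a weighted sum of per-task losses, and a small sketch may help make the three-task setup concrete. The abstract does not state which loss functions or weights the authors use, so everything below is an assumption: binary cross-entropy for the saliency and contour maps, L1 for the depth map, and equal weights. The names `bce`, `l1` and `joint_loss` are hypothetical and are not taken from the paper.

```python
import math


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy over flat lists of probabilities and 0/1 labels."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(pred)


def l1(pred, target):
    """Mean absolute error over flat lists of values."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def joint_loss(outputs, targets, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three task losses (assumed form, not the paper's)."""
    w_sal, w_depth, w_contour = weights
    return (
        w_sal * bce(outputs["saliency"], targets["saliency"])
        + w_depth * l1(outputs["depth"], targets["depth"])
        + w_contour * bce(outputs["contour"], targets["contour"])
    )


# Toy example: four "pixels" per map, flattened to lists.
outputs = {
    "saliency": [0.9, 0.2, 0.8, 0.1],
    "depth": [0.5, 0.4, 0.7, 0.3],
    "contour": [0.1, 0.9, 0.2, 0.1],
}
targets = {
    "saliency": [1, 0, 1, 0],
    "depth": [0.6, 0.4, 0.6, 0.2],
    "contour": [0, 1, 0, 0],
}
loss = joint_loss(outputs, targets)
```

In this sketch the depth target is needed only to compute the training loss. That is consistent with the depth-free testing described in the abstract, where the model predicts the depth map itself rather than receiving it as input.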

Comments: Manuscript Version

Categories: cs.CV

Keywords: depth estimation, contour extraction, depth map, joint learning, multi-modal filtered transformer

Related articles:

arXiv:1604.07480 [cs.CV] (Published 2016-04-25)

Joint Semantic Segmentation and Depth Estimation with Deep Convolutional Networks

Arsalan Mousavian, Hamed Pirsiavash, Jana Kosecka

arXiv:2410.11610 [cs.CV] (Published 2024-10-15)

Depth Estimation From Monocular Images With Enhanced Encoder-Decoder Architecture

Dabbrata Das, Argho Deb Das, Farhan Sadaf

arXiv:2003.08933 [cs.CV] (Published 2020-03-19)

Depth Estimation by Learning Triangulation and Densification of Sparse Points for Multi-view Stereo

Ayan Sinha, Zak Murez, James Bartolozzi, Vijay Badrinarayanan, Andrew Rabinovich
