arXiv Analytics


arXiv:2104.11225 [cs.CV]

Pri3D: Can 3D Priors Help 2D Representation Learning?

Ji Hou, Saining Xie, Benjamin Graham, Angela Dai, Matthias Nießner

Published 2021-04-22, Version 1

Recent advances in 3D perception have shown impressive progress in understanding geometric structures of 3D shapes and even scenes. Inspired by these advances in geometric understanding, we aim to imbue image-based perception with representations learned under geometric constraints. We introduce an approach to learn view-invariant, geometry-aware representations for network pre-training, based on multi-view RGB-D data, that can then be effectively transferred to downstream 2D tasks. We propose to employ contrastive learning under both multi-view image constraints and image-geometry constraints to encode 3D priors into learned 2D representations. This results not only in improvement over 2D-only representation learning on the image-based tasks of semantic segmentation, instance segmentation, and object detection on real-world indoor datasets, but moreover, provides significant improvement in the low data regime. We show a significant improvement of 6.0% on semantic segmentation on full data as well as 11.9% on 20% data against baselines on ScanNet.
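To make the multi-view contrastive idea concrete, below is a minimal sketch of a pixel-level InfoNCE loss between two RGB-D views, where corresponding pixels (found via depth and camera poses) form positive pairs and all other pairs in the batch act as negatives. The function name, shapes, and temperature value are illustrative assumptions, not the authors' implementation.

    # Hypothetical sketch: pixel-level InfoNCE between two views (PyTorch).
    # Assumes feat_a[i] and feat_b[i] are features of pixels observing the
    # same 3D surface point; all names and shapes are illustrative only.
    import torch
    import torch.nn.functional as F

    def multiview_pixel_infonce(feat_a, feat_b, temperature=0.07):
        # feat_a, feat_b: (N, C) features for N corresponding pixel pairs.
        feat_a = F.normalize(feat_a, dim=1)
        feat_b = F.normalize(feat_b, dim=1)
        logits = feat_a @ feat_b.t() / temperature   # (N, N) similarity matrix
        labels = torch.arange(feat_a.size(0), device=feat_a.device)
        return F.cross_entropy(logits, labels)       # positives on the diagonal

    # Toy usage: 128 corresponding pixel pairs with 64-dim features.
    loss = multiview_pixel_infonce(torch.randn(128, 64), torch.randn(128, 64))

In the paper's setting, an analogous objective is also applied between image pixels and 3D geometry (image-geometry constraints); the sketch above covers only the multi-view image case.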

Related articles:
arXiv:1312.7715 [cs.CV] (Published 2013-12-30, updated 2014-07-31)
Constrained Parametric Proposals and Pooling Methods for Semantic Segmentation in RGB-D Images
arXiv:1807.02917 [cs.CV] (Published 2018-07-09)
Attention to Refine through Multi-Scales for Semantic Segmentation
arXiv:1805.08403 [cs.CV] (Published 2018-05-22)
Autofocus Layer for Semantic Segmentation