arXiv:1312.3429 [cs.CV]

Unsupervised learning of depth and motion

Kishore Konda, Roland Memisevic

Published 2013-12-12, updated 2013-12-16 (Version 2)

We present a model for the joint estimation of disparity and motion. The model is based on learning the interrelations between images from multiple cameras, multiple frames in a video, or a combination of both. We show that depth and motion cues, as well as their combinations, can be learned from data within a single type of architecture and with a single type of learning algorithm, by using biologically inspired "complex cell"-like units, which encode correlations between the pixels across image pairs. Our experimental results show that learning depth and motion makes it possible to achieve state-of-the-art performance in 3-D activity analysis and to outperform existing hand-engineered 3-D motion features by a very large margin.
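
As a rough illustration of what a "complex cell"-like unit that encodes correlations across an image pair might compute, the following is a minimal sketch of a textbook energy-model unit on 1-D signals. It is not the model from the paper: the filters here are fixed sinusoids rather than learned, and the names `quadrature_pair`, `energy_response`, and `estimate_shift` are hypothetical helpers introduced only for this example. Each unit sums the responses of a filter pair applied to the two inputs and squares the result; the unit whose filters are offset by the true shift between the inputs should respond most strongly.

```python
import numpy as np


def quadrature_pair(n, freq, shift=0):
    """Return a cosine/sine filter pair of length n, displaced by `shift` samples."""
    t = np.arange(n)
    phase = 2 * np.pi * freq * (t - shift) / n
    return np.cos(phase), np.sin(phase)


def energy_response(x, y, freq, shift):
    """Energy of one unit tuned to a given shift between x and y.

    The filter responses to both inputs are summed first and squared
    afterwards, so the output depends on how the two inputs relate to
    each other, not only on their individual content.
    """
    fx_c, fx_s = quadrature_pair(len(x), freq)
    fy_c, fy_s = quadrature_pair(len(y), freq, shift)
    even = fx_c @ x + fy_c @ y
    odd = fx_s @ x + fy_s @ y
    return even ** 2 + odd ** 2


def estimate_shift(x, y, freq=4, candidates=range(8)):
    """Pick the candidate shift whose unit responds most strongly."""
    candidates = list(candidates)
    responses = [energy_response(x, y, freq, s) for s in candidates]
    return candidates[int(np.argmax(responses))]


# Toy "image pair": y is a circularly shifted copy of x.
rng = np.random.default_rng(0)
x = rng.standard_normal(64)
y = np.roll(x, 3)
estimated = estimate_shift(x, y)
```

In this toy setting the candidate shifts are kept below half the filter period (16 samples for `freq=4` and length 64), which avoids the phase ambiguity of a single-frequency filter. A learned model would instead use many filters of different frequencies and orientations, which is an assumption about the general approach rather than a detail taken from the abstract.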

Related articles:
arXiv:1505.00687 [cs.CV] (Published 2015-05-04)
Unsupervised Learning of Visual Representations using Videos
arXiv:2407.14620 [cs.CV] (Published 2024-07-19)
The Research of Group Re-identification from Multiple Cameras
arXiv:1703.09771 [cs.CV] (Published 2017-03-28)
Deep 6-DOF Tracking