Real-Time Monocular Object-Model Aware Sparse SLAM
Published 2018-09-24Version 1
Simultaneous Localization And Mapping (SLAM) is a fundamental problem in mobile robotics. While sparse point-based SLAM methods provide accurate camera localization, the generated maps lack semantic information. On the other hand, state of the art object detection methods provide rich information about entities present in the scene from a single image. This work incorporates a real-time deep-learned object detector to the monocular SLAM framework for representing generic objects as quadrics that permit detections to be seamlessly integrated while allowing the real-time performance. Finer reconstruction of an object, learned by a CNN network, is also incorporated and provides a shape prior for the quadric leading further refinement. To capture the dominant structure of the scene, additional planar landmarks are detected by a CNN-based plane detector and modelled as landmarks in the map. Experiments show that the introduced plane and object landmarks and the associated constraints, using the proposed monocular plane detector and incorporated object detector, significantly improve camera localization and lead to a richer semantically more meaningful map. The performance of our SLAM system is demonstrated in https://youtu.be/UMWXd4sHONw .