arXiv:1906.07207 [cs.RO]

Visual Navigation by Generating Next Expected Observations

Qiaoyun Wu, Dinesh Manocha, Jun Wang, Kai Xu

Published 2019-06-17 (Version 1)

We propose a novel approach to visual navigation in unknown environments in which the agent is guided by imagining the next observations it expects to see after taking the next best action. This is achieved by learning a variational Bayesian model that generates the next expected observations (NEO) conditioned on the agent's current observations and the target view. Our approach then predicts the next best action based on the current observation and the NEO. The generative model is learned by optimizing a variational objective with two key designs. First, the latent distribution is conditioned on the current observations and the target view, supporting model-based, target-driven navigation. Second, the latent space is modeled with a Mixture of Gaussians conditioned on the current observation and the next best action. This mixture-of-posteriors prior effectively alleviates the problem of an over-regularized latent space, thus facilitating model generalization in novel scenes. Moreover, NEO generation models the forward dynamics of the agent-environment interaction, which improves the quality of approximate inference and hence benefits data efficiency. We have conducted extensive evaluations on both real-world and synthetic benchmarks, showing that our model significantly outperforms state-of-the-art RL-based methods in terms of success rate, data efficiency, and cross-scene generalization.
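To make the abstract's pipeline concrete, below is a minimal PyTorch sketch of the kind of conditional variational model it describes: an encoder produces a posterior over a latent code from the current observation and the target view, a decoder generates the NEO, a policy head predicts the next best action from the current observation and the NEO, and the KL term is taken against a mixture of per-action Gaussian priors rather than a single standard normal. All module names, layer sizes, the feature extractor, and the Monte-Carlo KL estimator are assumptions for illustration; the paper's actual architecture and objective may differ.

```python
# Hypothetical sketch (not the authors' code): conditional VAE for next-expected-
# observation (NEO) generation with a mixture-of-Gaussians prior over the latent,
# one prior component per discrete action, plus an action-prediction head.
import math
import torch
import torch.nn as nn

class NEOGenerator(nn.Module):
    def __init__(self, feat_dim=512, latent_dim=32, num_actions=4):
        super().__init__()
        self.num_actions = num_actions
        # Approximate posterior q(z | current obs, target view)
        self.encoder = nn.Sequential(nn.Linear(2 * feat_dim, 256), nn.ReLU())
        self.mu_head = nn.Linear(256, latent_dim)
        self.logvar_head = nn.Linear(256, latent_dim)
        # One Gaussian prior component per action, conditioned on the current obs:
        # p(z | obs, a) = N(mu_a(obs), diag(sigma_a(obs)^2))
        self.prior_mu = nn.Linear(feat_dim, num_actions * latent_dim)
        self.prior_logvar = nn.Linear(feat_dim, num_actions * latent_dim)
        # Decoder reconstructs the NEO feature from (z, current obs)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + feat_dim, 256), nn.ReLU(),
            nn.Linear(256, feat_dim),
        )
        # Policy head: next best action from (current obs, NEO)
        self.policy = nn.Linear(2 * feat_dim, num_actions)

    def forward(self, obs_feat, target_feat):
        h = self.encoder(torch.cat([obs_feat, target_feat], dim=-1))
        mu, logvar = self.mu_head(h), self.logvar_head(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        neo = self.decoder(torch.cat([z, obs_feat], dim=-1))
        action_logits = self.policy(torch.cat([obs_feat, neo], dim=-1))
        return neo, action_logits, mu, logvar

    def kl_to_mixture_prior(self, mu, logvar, obs_feat):
        # KL(q || p) against a uniform mixture of per-action Gaussian priors,
        # approximated with a single Monte-Carlo sample of q (one common choice;
        # the Gaussian normalizing constants cancel between log q and log p).
        B, D = mu.shape
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        log_q = (-0.5 * (logvar + (z - mu) ** 2 / logvar.exp())).sum(-1)
        p_mu = self.prior_mu(obs_feat).view(B, self.num_actions, D)
        p_logvar = self.prior_logvar(obs_feat).view(B, self.num_actions, D)
        log_p_k = (-0.5 * (p_logvar + (z.unsqueeze(1) - p_mu) ** 2 / p_logvar.exp())).sum(-1)
        log_p = torch.logsumexp(log_p_k, dim=1) - math.log(self.num_actions)
        return (log_q - log_p).mean()
```

The mixture prior is the point of interest here: because the KL term pulls the posterior toward the nearest action-conditioned component rather than toward a single standard normal, the latent space is not forced into one over-regularized mode, which is the property the abstract credits for better cross-scene generalization.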

Comments: 8 content pages, 4 supplementary pages, Corresponding author: Kai Xu (kevin.kai.xu@gmail.com)
Categories: cs.RO, cs.CV, cs.LG
Related articles:
arXiv:1912.04078 [cs.RO] (Published 2019-12-09)
Reinforcement Learning based Visual Navigation with Information-Theoretic Regularization
arXiv:2310.15020 [cs.RO] (Published 2023-10-23, updated 2023-12-04)
Invariance is Key to Generalization: Examining the Role of Representation in Sim-to-Real Transfer for Visual Navigation
arXiv:2210.14791 [cs.RO] (Published 2022-10-26)
ViNL: Visual Navigation and Locomotion Over Obstacles