

arXiv:1810.06530 [cs.LG]

Successor Uncertainties: exploration and uncertainty in temporal difference learning

David Janz, Jiri Hron, José Miguel Hernández-Lobato, Katja Hofmann, Sebastian Tschiatschek

Published 2018-10-15, Version 1

We consider the problem of balancing exploration and exploitation in sequential decision-making problems. To explore efficiently, it is vital to consider the uncertainty over all consequences of a decision, and not just those that follow immediately; the uncertainties involved need to be propagated according to the dynamics of the problem. To this end, we develop Successor Uncertainties, a probabilistic model for the state-action value function of a Markov Decision Process that propagates uncertainties in a coherent and scalable way. We relate our approach to other classical and contemporary methods for exploration and present an empirical analysis.

Related articles:
arXiv:2006.07464 [cs.LG] (Published 2020-06-12)
Hypermodels for Exploration
arXiv:2301.12822 [cs.LG] (Published 2023-01-30)
Evaluating COVID-19 vaccine allocation policies using Bayesian $m$-top exploration
arXiv:2310.08702 [cs.LG] (Published 2023-10-12)
ELDEN: Exploration via Local Dependencies