
arXiv:1810.06530 [cs.LG]

Successor Uncertainties: exploration and uncertainty in temporal difference learning

David Janz, Jiri Hron, José Miguel Hernández-Lobato, Katja Hofmann, Sebastian Tschiatschek

Published 2018-10-15, Version 1

We consider the problem of balancing exploration and exploitation in sequential decision-making problems. To explore efficiently, it is vital to consider the uncertainty over all consequences of a decision, and not just those that follow immediately; the uncertainties involved need to be propagated according to the dynamics of the problem. To this end, we develop Successor Uncertainties, a probabilistic model for the state-action value function of a Markov Decision Process that propagates uncertainties in a coherent and scalable way. We relate our approach to other classical and contemporary methods for exploration and present an empirical analysis.

Categories: cs.LG, stat.ML

Keywords: uncertainty, successor uncertainties, temporal difference learning, exploration, Markov decision process

Related articles:

arXiv:2006.07464 [cs.LG] (Published 2020-06-12)

Hypermodels for Exploration

Vikranth Dwaracherla, Xiuyuan Lu, Morteza Ibrahimi, Ian Osband, Zheng Wen, Benjamin Van Roy

arXiv:2302.04009 [cs.LG] (Published 2023-02-08)

Investigating the role of model-based learning in exploration and transfer

Jacob Walker, Eszter Vértes, Yazhe Li, Gabriel Dulac-Arnold, Ankesh Anand, Théophane Weber, Jessica B. Hamrick

arXiv:2301.12822 [cs.LG] (Published 2023-01-30)

Evaluating COVID-19 vaccine allocation policies using Bayesian $m$-top exploration

Alexandra Cimpean, Timothy Verstraeten, Lander Willem, Niel Hens, Ann Nowé, Pieter Libin