
arXiv:1703.00956 [cs.LG]

A Laplacian Framework for Option Discovery in Reinforcement Learning

Marlos C. Machado, Marc G. Bellemare, Michael Bowling

Published 2017-03-02, Version 1

Representation learning and option discovery are two of the biggest challenges in reinforcement learning (RL). Proto-RL is a well-known approach for representation learning in MDPs. The representations learned with this framework are called proto-value functions (PVFs). In this paper we address the option discovery problem by showing how PVFs implicitly define options. We do it by introducing eigenpurposes, intrinsic reward functions derived from the learned representations. The options discovered from eigenpurposes traverse the principal directions of the state space. They are useful for multiple tasks because they are independent of the agents' intentions. Moreover, by capturing the diffusion process of a random walk, different options act at different time scales, making them helpful for exploration strategies. We demonstrate features of eigenpurposes in traditional tabular domains as well as in Atari 2600 games.
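
The idea of turning a PVF into an intrinsic reward can be illustrated on a toy problem. The sketch below is not the authors' implementation and assumes details the abstract does not state: a 7-state chain as the environment, the combinatorial Laplacian L = D - A, the closed-form eigenvectors of a path graph as PVFs, a tabular eigenpurpose written as r(s, s') = e[s'] - e[s], and plain value iteration with a zero-reward "terminate" action and discount 0.9. All function names are hypothetical.

```python
import math

def path_laplacian(n):
    """Combinatorial Laplacian L = D - A of an n-state chain (assumed toy domain)."""
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i + 1):
            if 0 <= j < n:
                L[i][j] = -1.0   # adjacency entry
                L[i][i] += 1.0   # degree entry
    return L

def path_eigenpair(n, k):
    """Closed-form k-th eigenpair of the chain Laplacian (the eigenvector is a PVF)."""
    lam = 2.0 - 2.0 * math.cos(math.pi * k / n)
    vec = [math.cos(math.pi * k * (i + 0.5) / n) for i in range(n)]
    return lam, vec

def residual(L, lam, vec):
    """Largest absolute entry of L @ vec - lam * vec; near zero for a true eigenpair."""
    n = len(vec)
    return max(abs(sum(L[i][j] * vec[j] for j in range(n)) - lam * vec[i])
               for i in range(n))

def eigenpurpose_reward(e, s, s_next):
    """Tabular intrinsic reward (assumed form): change in the eigenvector's value."""
    return e[s_next] - e[s]

def eigenoption_policy(e, gamma=0.9, sweeps=200):
    """Value iteration under the intrinsic reward; terminate where no action has positive value."""
    n = len(e)
    moves = {"left": -1, "right": +1}
    v = [0.0] * n

    def q(s, a):
        s2 = min(max(s + moves[a], 0), n - 1)   # walls keep the agent in place
        return eigenpurpose_reward(e, s, s2) + gamma * v[s2]

    for _ in range(sweeps):
        v = [max(0.0, q(s, "left"), q(s, "right")) for s in range(n)]

    policy = []
    for s in range(n):
        best = max(moves, key=lambda a: q(s, a))
        policy.append(best if q(s, best) > 1e-12 else "terminate")
    return policy

n = 7
L = path_laplacian(n)
lam, e = path_eigenpair(n, 1)          # smoothest non-constant eigenvector
policy = eigenoption_policy(e)
print(residual(L, lam, e))
print(policy)
```

On this chain the first non-constant eigenvector decreases monotonically from one end to the other, so the resulting option walks to one end of the chain and terminates there. Negating the eigenvector gives the option for the opposite direction, and higher-index eigenvectors oscillate faster, yielding options that terminate after shorter walks.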

Comments: Version submitted to the 34th International Conference on Machine Learning (ICML)

Categories: cs.LG, cs.AI

Keywords: reinforcement learning, Laplacian framework, option discovery problem, PVFs implicitly define options, traditional tabular domains

Tags: conference paper

Related articles:

arXiv:1809.07066 [cs.LG] (Published 2018-09-19)

Prosocial or Selfish? Agents with different behaviors for Contract Negotiation using Reinforcement Learning

Vishal Sunder, Lovekesh Vig, Arnab Chatterjee, Gautam Shroff

arXiv:1809.01560 [cs.LG] (Published 2018-09-05)

Reinforcement Learning under Threats

Víctor Gallego, Roi Naveiro, David Ríos Insua

arXiv:1402.0560 [cs.LG] (Published 2014-02-04)