arXiv Analytics

arXiv:1506.03379 [cs.LG]

The Online Discovery Problem and Its Application to Lifelong Reinforcement Learning

Emma Brunskill, Lihong Li

Published 2015-06-10 (Version 1)

Transferring knowledge across a sequence of related tasks is an important challenge in reinforcement learning. Despite much encouraging empirical evidence showing the benefits of transfer, there has been very little theoretical analysis. In this paper, we study a class of lifelong reinforcement-learning problems: the agent solves a sequence of tasks modeled as finite Markov decision processes (MDPs), each drawn from a finite set of MDPs that share state/action spaces but differ in their transition and reward functions. Inspired by the need for cross-task exploration in lifelong learning, we formulate a novel online discovery problem and give an optimal learning algorithm to solve it. These results allow us to develop a new lifelong reinforcement-learning algorithm whose overall sample complexity across a sequence of tasks is, with high probability, much smaller than that of single-task learning, even if the sequence of tasks is generated by an adversary. The benefits of the algorithm are demonstrated in a simulated problem.
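To make the problem setting concrete, here is a minimal Python sketch of the lifelong setup described above: a finite pool of tabular MDPs that share state/action spaces but have task-specific transitions and rewards, with an adversarially chosen task sequence. This illustrates only the setting, not the paper's algorithm; the names (TabularMDP, lifelong_sequence), the random model construction, and the placeholder policy are all hypothetical.

```python
import numpy as np

class TabularMDP:
    """One MDP from the finite pool: shared state/action spaces,
    task-specific transition kernel P and reward table R (illustrative)."""

    def __init__(self, n_states, n_actions, rng):
        # P[s, a] is a probability distribution over next states.
        self.P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
        # Deterministic rewards in [0, 1], for simplicity.
        self.R = rng.random((n_states, n_actions))

    def step(self, s, a, rng):
        s_next = rng.choice(len(self.P[s, a]), p=self.P[s, a])
        return s_next, self.R[s, a]


def lifelong_sequence(pool, task_indices):
    """An adversary fixes which pool member the agent faces in each
    task; the agent observes only transitions, never the index."""
    for i in task_indices:
        yield pool[i]


rng = np.random.default_rng(0)
pool = [TabularMDP(n_states=5, n_actions=2, rng=rng) for _ in range(3)]

# Adversarial task sequence over the pool, unknown to the agent.
for mdp in lifelong_sequence(pool, task_indices=[2, 0, 2, 1]):
    s = 0
    for _ in range(10):
        a = int(rng.integers(2))   # placeholder exploration policy
        s, r = mdp.step(s, a, rng)
```

In this framing, cross-task exploration amounts to deciding from observed transitions which pool member generated the current task, which is roughly what the abstract's online discovery problem formalizes.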

Related articles:
arXiv:1502.04469 [cs.LG] (Published 2015-02-16)
Classification and its application to drug-target interaction prediction
arXiv:1109.5078 [cs.LG] (Published 2011-09-23)
Application of distances between terms for flat and hierarchical data
arXiv:1302.6937 [cs.LG] (Published 2013-02-27, updated 2014-06-10)
Online Convex Optimization Against Adversaries with Memory and Application to Statistical Arbitrage