arXiv Analytics

arXiv:1803.00590 [cs.LG]

Hierarchical Imitation and Reinforcement Learning

Hoang M. Le, Nan Jiang, Alekh Agarwal, Miroslav Dudík, Yisong Yue, Hal Daumé III

Published 2018-03-01 (Version 1)

We study the problem of learning policies over long time horizons. We present a framework that leverages and integrates two key concepts. First, we utilize hierarchical policy classes that enable planning over different time scales: the high-level planner proposes a sequence of subgoals for the low-level planner to achieve. Second, we utilize expert demonstrations within the hierarchical action space to dramatically reduce the cost of exploration. Our framework is flexible and can incorporate different combinations of imitation learning (IL) and reinforcement learning (RL) at different levels of the hierarchy. Using long-horizon benchmarks, including Montezuma's Revenge, we empirically demonstrate that our approach learns significantly faster than hierarchical RL and is significantly more label- and sample-efficient than flat IL. We also provide a theoretical analysis of the labeling cost for certain instantiations of our framework.
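The two-level control loop the abstract describes — a high-level planner proposing subgoals, a low-level policy acting until each subgoal is achieved — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the grid environment, the random subgoal proposer, and the greedy low-level policy are all hypothetical stand-ins for learned (IL- or RL-trained) components.

```python
import random

def high_level_planner(state):
    """Hypothetical high-level planner: propose the next subgoal.

    In the paper's framework this component would be learned via IL or RL;
    here it simply picks a random target cell in a 5x5 grid.
    """
    return (random.randint(0, 4), random.randint(0, 4))

def low_level_policy(state, subgoal):
    """Hypothetical low-level policy: one greedy step toward the subgoal."""
    (x, y), (gx, gy) = state, subgoal
    if x != gx:
        return (x + (1 if gx > x else -1), y)
    if y != gy:
        return (x, y + (1 if gy > y else -1))
    return state  # already at the subgoal

def run_episode(start=(0, 0), num_subgoals=3, max_low_steps=20):
    """Alternate high-level subgoal proposals with low-level execution.

    The high level operates on a coarser time scale: it acts only once per
    subgoal, while the low level takes up to max_low_steps primitive actions.
    """
    state, trajectory = start, [start]
    for _ in range(num_subgoals):
        subgoal = high_level_planner(state)
        for _ in range(max_low_steps):
            if state == subgoal:  # subgoal achieved; return control upward
                break
            state = low_level_policy(state, subgoal)
            trajectory.append(state)
    return trajectory
```

The separation of time scales is what the framework exploits: expert labels can be collected at the cheap high level (subgoal choices) while only some low-level subtasks need demonstrations or RL exploration.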

Related articles:
arXiv:1203.3481 [cs.LG] (Published 2012-03-15)
Real-Time Scheduling via Reinforcement Learning
arXiv:1809.10679 [cs.LG] (Published 2018-09-27)
Definition and evaluation of model-free coordination of electrical vehicle charging with reinforcement learning
arXiv:1809.09095 [cs.LG] (Published 2018-09-23)
On Reinforcement Learning for Full-length Game of StarCraft