arXiv Analytics

arXiv:1705.08439 [cs.AI]

Thinking Fast and Slow with Deep Learning and Tree Search

Thomas Anthony, Zheng Tian, David Barber

Published 2017-05-23, Version 1

Solving sequential decision making problems, such as text parsing, robotic control, and game playing, requires a combination of planning policies and generalisation of those plans. In this paper, we present Expert Iteration, a novel algorithm which decomposes the problem into separate planning and generalisation tasks. Planning new policies is performed by tree search, while a deep neural network generalises those plans. In contrast, standard Deep Reinforcement Learning algorithms rely on a neural network not only to generalise plans, but to discover them too. We show that our method substantially outperforms Policy Gradients in the board game Hex, winning 84.4% of games against it when trained for equal time.
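
The split between planning and generalisation described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the game (a take-1-to-3 Nim variant instead of Hex), the tabular "apprentice" standing in for the deep neural network, the depth-2 negamax standing in for tree search, and all function names are assumptions made for illustration. The sketch also feeds the apprentice back into the search as a leaf evaluator, which is a design choice of this example rather than something stated in the abstract.

```python
# Toy sketch of an expert-iteration-style loop (illustrative assumptions only).
# Game: a pile of stones, each player removes 1-3, whoever takes the last stone wins.

MOVES = (1, 2, 3)


def legal_moves(pile):
    """Moves available from a non-empty pile."""
    return [m for m in MOVES if m <= pile]


def rollout(pile, policy):
    """Play the game out with the apprentice policy for both sides.

    Returns +1 if the player to move at the start wins, otherwise -1.
    Positions the apprentice has not seen default to taking one stone.
    """
    player = 0
    while True:
        move = policy.get(pile, legal_moves(pile)[0])
        pile -= move
        if pile == 0:
            return 1 if player == 0 else -1
        player ^= 1


def expert_move(pile, policy, depth=2):
    """Planning step: depth-limited negamax with apprentice rollouts at the leaves."""

    def value(p, d):
        if p == 0:
            return -1  # the opponent just took the last stone
        if d == 0:
            return rollout(p, policy)
        return max(-value(p - m, d - 1) for m in legal_moves(p))

    return max(legal_moves(pile), key=lambda m: -value(pile - m, depth - 1))


def expert_iteration(max_pile=12, iterations=4):
    """Alternate planning (search) and generalisation (imitation).

    The 'apprentice' is a lookup table that copies the expert's chosen move,
    a stand-in for training a neural network on the expert's targets.
    """
    policy = {}
    for _ in range(iterations):
        targets = {p: expert_move(p, policy) for p in range(1, max_pile + 1)}
        policy = targets  # imitation step
    return policy


if __name__ == "__main__":
    print(expert_iteration())
```

In this toy setting the shallow search alone cannot see far enough to play large piles well, but each round of imitation gives the next search better leaf evaluations, so the table improves over successive iterations.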

Related articles:
arXiv:2112.11947 [cs.AI] (Published 2021-12-22, updated 2022-05-27)
Evaluating the Robustness of Deep Reinforcement Learning for Autonomous and Adversarial Policies in a Multi-agent Urban Driving Environment
arXiv:2106.12207 [cs.AI] (Published 2021-06-23)
Not all users are the same: Providing personalized explanations for sequential decision making problems
arXiv:2010.06002 [cs.AI] (Published 2020-10-12)
Thinking Fast and Slow in AI
Grady Booch et al.