arXiv:2302.06692 Abstract | arXiv Analytics

arXiv:2302.06692 [cs.LG]

Guiding Pretraining in Reinforcement Learning with Large Language Models

Yuqing Du, Olivia Watkins, Zihan Wang, Cédric Colas, Trevor Darrell, Pieter Abbeel, Abhishek Gupta, Jacob Andreas

Published 2023-02-13, Version 1

Reinforcement learning algorithms typically struggle in the absence of a dense, well-shaped reward function. Intrinsically motivated exploration methods address this limitation by rewarding agents for visiting novel states or transitions, but these methods offer limited benefits in large environments where most discovered novelty is irrelevant for downstream tasks. We describe a method that uses background knowledge from text corpora to shape exploration. This method, called ELLM (Exploring with LLMs), rewards an agent for achieving goals suggested by a language model prompted with a description of the agent's current state. By leveraging large-scale language model pretraining, ELLM guides agents toward human-meaningful and plausibly useful behaviors without requiring a human in the loop. We evaluate ELLM in the Crafter game environment and the Housekeep robotic simulator, showing that ELLM-trained agents have better coverage of common-sense behaviors during pretraining and usually match or improve performance on a range of downstream tasks.
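As a rough illustration of the reward idea in the abstract (goals suggested from a state description, reward for achieving one of them), here is a minimal sketch. It is not the authors' implementation. The names `suggest_goals_stub`, `token_overlap`, and `goal_reward`, the word-overlap similarity, and the 0.5 threshold are all assumptions made for illustration, and the language model call is replaced by a hard-coded stub.

```python
from typing import Callable, List


def token_overlap(a: str, b: str) -> float:
    """Jaccard similarity over lowercase word sets.

    Assumption: a simple stand-in for whatever goal-matching measure
    a real system would use.
    """
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def suggest_goals_stub(state_description: str) -> List[str]:
    """Hypothetical placeholder for prompting a language model.

    A real system would send the state description to an LLM and
    parse the suggested goals from its reply.
    """
    goals = []
    if "tree" in state_description:
        goals.append("collect wood")
    if "water" in state_description:
        goals.append("drink water")
    return goals


def goal_reward(
    state_description: str,
    achieved: str,
    suggest_goals: Callable[[str], List[str]] = suggest_goals_stub,
    threshold: float = 0.5,  # assumed cutoff, not taken from the abstract
) -> float:
    """Return a reward if `achieved` matches any suggested goal closely enough."""
    scores = [token_overlap(goal, achieved) for goal in suggest_goals(state_description)]
    best = max(scores, default=0.0)
    return best if best >= threshold else 0.0


if __name__ == "__main__":
    state = "you see a tree and water"
    print(goal_reward(state, "collect wood"))  # matches a suggested goal
    print(goal_reward(state, "eat cow"))       # matches no suggested goal
```

In this sketch the agent is rewarded only when the description of what it just did lines up with a suggested goal, so behavior that is novel but unrelated to any suggestion earns nothing.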

Categories: cs.LG, cs.AI, cs.CL

Keywords: large language models, reinforcement learning, intrinsically motivated exploration, large-scale language model pretraining

Related articles:

arXiv:2309.10668 [cs.LG] (Published 2023-09-19)

Language Modeling Is Compression

Grégoire Delétang et al.

arXiv:2309.02784 [cs.LG] (Published 2023-09-06)

Norm Tweaking: High-performance Low-bit Quantization of Large Language Models

Liang Li, Qingyuan Li, Bo Zhang, Xiangxiang Chu

arXiv:2211.01910 [cs.LG] (Published 2022-11-03)

Large Language Models Are Human-Level Prompt Engineers

Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han, Keiran Paster, Silviu Pitis, Harris Chan, Jimmy Ba
