arXiv:2112.10751 [cs.LG]

RvS: What is Essential for Offline RL via Supervised Learning?

Scott Emmons, Benjamin Eysenbach, Ilya Kostrikov, Sergey Levine

Published 2021-12-20, updated 2022-05-11 (version 2)

Recent work has shown that supervised learning alone, without temporal difference (TD) learning, can be remarkably effective for offline RL. When does this hold true, and which algorithmic components are necessary? Through extensive experiments, we boil supervised learning for offline RL down to its essential elements. In every environment suite we consider, simply maximizing likelihood with a two-layer feedforward MLP is competitive with state-of-the-art results of substantially more complex methods based on TD learning or sequence modeling with Transformers. Carefully choosing model capacity (e.g., via regularization or architecture) and choosing which information to condition on (e.g., goals or rewards) are critical for performance. These insights serve as a field guide for practitioners doing Reinforcement Learning via Supervised Learning (which we coin "RvS learning"). They also probe the limits of existing RvS methods, which are comparatively weak on random data, and suggest a number of open problems.
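The recipe the abstract describes is conditional behavior cloning: an offline dataset of (state, outcome, action) tuples, a small feedforward policy that takes the state together with a conditioning variable (a goal or a reward target), and a plain maximum-likelihood loss. The sketch below illustrates that recipe under stated assumptions: continuous actions with a fixed-variance Gaussian likelihood (so maximum likelihood reduces to mean squared error), a hypothetical hidden width of 256, and an illustrative training step. It is not the authors' exact architecture or training configuration.

```python
import torch
import torch.nn as nn

class RvSPolicy(nn.Module):
    """Feedforward MLP mapping (state, conditioning variable) -> action.

    The conditioning variable can be a goal state or a reward target;
    dimensions and hidden width here are illustrative assumptions.
    """
    def __init__(self, state_dim, cond_dim, action_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim),
        )

    def forward(self, state, cond):
        # Concatenate the state with whatever we condition on, then predict the action.
        return self.net(torch.cat([state, cond], dim=-1))


def rvs_training_step(policy, optimizer, batch):
    """One maximum-likelihood (behavior cloning) update on an offline batch.

    With a fixed-variance Gaussian likelihood over continuous actions,
    maximizing log-likelihood is equivalent to minimizing mean squared error.
    """
    state, cond, action = batch  # tensors drawn from the offline dataset
    pred = policy(state, cond)
    loss = ((pred - action) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this framing, the choice of what to pass as `cond` (goals versus rewards) and the capacity of the MLP are exactly the design decisions the abstract identifies as critical for performance.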

Related articles:
arXiv:2203.01387 [cs.LG] (Published 2022-03-02)
A Survey on Offline Reinforcement Learning: Taxonomy, Review, and Open Problems
arXiv:2005.01643 [cs.LG] (Published 2020-05-04)
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
arXiv:1310.5042 [cs.LG] (Published 2013-10-18)
Distributional semantics beyond words: Supervised learning of analogy and paraphrase