arXiv:2106.06483 Abstract | arXiv Analytics

arXiv:2106.06483 [cs.LG]

Optimal Model Selection in Contextual Bandits with Many Classes via Offline Oracles

Published 2021-06-11, Version 1

We study the problem of model selection for contextual bandits, in which the algorithm must balance the bias-variance trade-off for model estimation while also balancing the exploration-exploitation trade-off. In this paper, we propose the first reduction of model selection in contextual bandits to offline model selection oracles, allowing for flexible general purpose algorithms with computational requirements no worse than those for model selection for regression. Our main result is a new model selection guarantee for stochastic contextual bandits. When one of the classes in our set is realizable, up to a logarithmic dependency on the number of classes, our algorithm attains optimal realizability-based regret bounds for that class under one of two conditions: if the time-horizon is large enough, or if an assumption that helps with detecting misspecification holds. Hence our algorithm adapts to the complexity of this unknown class. Even when this realizable class is known, we prove improved regret guarantees in early rounds by relying on simpler model classes for those rounds and hence further establish the importance of model selection in contextual bandits.

Categories: cs.LG, stat.ML

Keywords: contextual bandits, optimal model selection, offline oracles, attains optimal realizability-based regret bounds, offline model selection oracles

Related articles:

arXiv:2110.03177 [cs.LG] (Published 2021-10-07, updated 2022-02-11)

EE-Net: Exploitation-Exploration Neural Networks in Contextual Bandits

Yikun Ban, Yuchen Yan, Arindam Banerjee, Jingrui He

arXiv:1903.08600 [cs.LG] (Published 2019-03-20)

Contextual Bandits with Random Projection

Xiaotian Yu

arXiv:2206.00314 [cs.LG] (Published 2022-06-01)

Contextual Bandits with Knapsacks for a Conversion Model

Zhen Li, Gilles Stoltz
