arXiv:2210.11834 [cs.LG]

Optimal Contextual Bandits with Knapsacks under Realizibility via Regression Oracles

Yuxuan Han, Jialin Zeng, Yang Wang, Yang Xiang, Jiheng Zhang

Published 2022-10-21, Version 1

We study the stochastic contextual bandit with knapsacks (CBwK) problem, where each action, taken upon observing a context, yields a random reward and also incurs a random, vector-valued resource consumption. The challenge is to maximize the total reward without violating the budget for any resource. We study this problem under a general realizability setting where the expected reward and expected cost are functions of contexts and actions that belong to given general function classes $\mathcal{F}$ and $\mathcal{G}$, respectively. Existing works on CBwK are restricted to the linear function class since they use UCB-type algorithms, which rely heavily on the linear form and are thus difficult to extend to general function classes. Motivated by online regression oracles that have been successfully applied to contextual bandits, we propose the first universal and optimal algorithmic framework for CBwK by reducing it to online regression. We also establish a regret lower bound to show the optimality of our algorithm for a variety of function classes.
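To make the problem setting concrete, the following is a minimal sketch of the CBwK interaction protocol only. It does not implement the paper's algorithm or any regression oracle. Everything in it is an assumption made for illustration: the names `run_cbwk` and `uniform_policy`, the scalar context, the toy reward and cost models, and the convention that a round which would overdraw a resource is discarded and ends the episode.

```python
import random


def uniform_policy(context, remaining, num_actions, rng):
    """Hypothetical baseline: ignore the context and pick an action uniformly."""
    return rng.randrange(num_actions)


def run_cbwk(policy, horizon, budgets, num_actions, seed=0):
    """Simulate one CBwK episode under a toy environment (illustrative only).

    The episode ends after `horizon` rounds, or earlier if the realized
    cost of a round would exceed the remaining budget of any resource.
    """
    rng = random.Random(seed)
    remaining = list(budgets)
    total_reward = 0.0
    rounds_played = 0
    for _ in range(horizon):
        context = rng.random()  # assumed scalar context in [0, 1)
        action = policy(context, remaining, num_actions, rng)
        # Assumed toy models: expected reward and expected cost are
        # functions of (context, action), as in the realizability setting.
        mean_reward = (action + 1) / num_actions * context
        mean_costs = [(action + 1) / num_actions for _ in budgets]
        reward = 1.0 if rng.random() < mean_reward else 0.0
        costs = [1.0 if rng.random() < m else 0.0 for m in mean_costs]
        # Assumed stopping rule: never let any budget go negative.
        if any(c > r for c, r in zip(costs, remaining)):
            break
        remaining = [r - c for r, c in zip(remaining, costs)]
        total_reward += reward
        rounds_played += 1
    return total_reward, remaining, rounds_played


if __name__ == "__main__":
    reward, remaining, rounds = run_cbwk(
        uniform_policy, horizon=200, budgets=[20.0, 30.0], num_actions=3, seed=1
    )
    print(f"rounds={rounds} reward={reward} remaining={remaining}")
```

In this toy model, higher-indexed actions earn more on average but also consume more of every resource, so a policy has to trade reward against budget. A learning algorithm for CBwK would replace `uniform_policy` with a rule that uses the context and the remaining budgets.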

Categories: cs.LG, stat.ML

Keywords: optimal contextual bandits, general function classes, realizability, linear function class, random resource consumption

Related articles:

arXiv:2202.13603 [cs.LG] (Published 2022-02-28)

Bandit Learning with General Function Classes: Heteroscedastic Noise and Variance-dependent Regret Bounds

Heyang Zhao, Dongruo Zhou, Jiafan He, Quanquan Gu

arXiv:2202.11091 [cs.LG] (Published 2022-02-22)

Efficient and Differentiable Conformal Prediction with General Function Classes

Yu Bai, Song Mei, Huan Wang, Yingbo Zhou, Caiming Xiong

arXiv:1905.10506 [cs.LG] (Published 2019-05-25)

A Kernel Loss for Solving the Bellman Equation

Yihao Feng, Lihong Li, Qiang Liu