arXiv:2409.18909 Abstract | arXiv Analytics

arXiv:2409.18909 [cs.LG]Abstract References Reviews Resources

Best Arm Identification with Minimal Regret

Junwen Yang, Vincent Y. F. Tan, Tianyuan Jin

Published 2024-09-27Version 1

Motivated by real-world applications that necessitate responsible experimentation, we introduce the problem of best arm identification (BAI) with minimal regret. This innovative variant of the multi-armed bandit problem elegantly amalgamates two of its most ubiquitous objectives: regret minimization and BAI. More precisely, the agent's goal is to identify the best arm with a prescribed confidence level $\delta$, while minimizing the cumulative regret up to the stopping time. Focusing on single-parameter exponential families of distributions, we leverage information-theoretic techniques to establish an instance-dependent lower bound on the expected cumulative regret. Moreover, we present an intriguing impossibility result that underscores the tension between cumulative regret and sample complexity in fixed-confidence BAI. Complementarily, we design and analyze the Double KL-UCB algorithm, which achieves asymptotic optimality as the confidence level tends to zero. Notably, this algorithm employs two distinct confidence bounds to guide arm selection in a randomized manner. Our findings elucidate a fresh perspective on the inherent connections between regret minimization and BAI.

Comments: Preprint

Categories: cs.LG, cs.IT, math.IT, stat.ML

Keywords: best arm identification, minimal regret, cumulative regret, regret minimization, multi-armed bandit problem elegantly amalgamates

Related articles: Most relevant | Search more

arXiv:2110.08627 [cs.LG] (Published 2021-10-16, updated 2022-10-12)

Achieving the Pareto Frontier of Regret Minimization and Best Arm Identification in Multi-Armed Bandits

Zixin Zhong, Wang Chi Cheung, Vincent Y. F. Tan

arXiv:1810.04088 [cs.LG] (Published 2018-10-09)

Bridging the gap between regret minimization and best arm identification, with application to A/B tests

Rémy Degenne, Thomas Nedelec, Clément Calauzènes, Vianney Perchet

arXiv:2403.09123 [cs.LG] (Published 2024-03-14)

Optimal Top-Two Method for Best Arm Identification and Fluid Analysis