arXiv:1502.00598 [cs.LG]

Thompson sampling with the online bootstrap

Maurits Kaptein, Davide Iannuzzi

Published 2015-02-02, Version 1

We often encounter situations in which an experimenter wants to find, by sequential experimentation, $x_{\max} = \arg\max_{x} f(x)$, where $f(x)$ is a (possibly unknown) function of a well-controllable variable $x$. Taking inspiration from physics and engineering, we have designed a new method to address this problem. In this paper, we first introduce the method in continuous time and then present two algorithms for use in sequential experiments. Through a series of simulation studies, we show that the method is effective for finding maxima of unknown functions by experimentation, even when the maximum of the function drifts or when the signal-to-noise ratio is low.

Related articles:
arXiv:1410.4009 [cs.LG] (Published 2014-10-15)
Thompson sampling with the online bootstrap
arXiv:1803.04623 [cs.LG] (Published 2018-03-13)
Thompson Sampling for Combinatorial Semi-Bandits
arXiv:1209.3352 [cs.LG] (Published 2012-09-15, updated 2014-02-03)
Thompson Sampling for Contextual Bandits with Linear Payoffs