A Non-asymptotic Approach to Best-Arm Identification for Gaussian Bandits

Antoine Barrier, Aurélien Garivier, Tomáš Kocák
Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:10078-10109, 2022.

Abstract

We propose a new strategy for best-arm identification with fixed confidence of Gaussian variables with bounded means and unit variance. This strategy, called Exploration-Biased Sampling, is not only asymptotically optimal: it is to the best of our knowledge the first strategy with non-asymptotic bounds that asymptotically matches the sample complexity. But the main advantage over other algorithms like Track-and-Stop is an improved behavior regarding exploration: Exploration-Biased Sampling is biased towards exploration in a subtle but natural way that makes it more stable and interpretable. These improvements are allowed by a new analysis of the sample complexity optimization problem, which yields a faster numerical resolution scheme and several quantitative regularity results that we believe of high independent interest.

Cite this Paper


BibTeX
@InProceedings{pmlr-v151-barrier22a, title = { A Non-asymptotic Approach to Best-Arm Identification for Gaussian Bandits }, author = {Barrier, Antoine and Garivier, Aur\'elien and Koc\'ak, Tom\'a\v{s}}, booktitle = {Proceedings of The 25th International Conference on Artificial Intelligence and Statistics}, pages = {10078--10109}, year = {2022}, editor = {Camps-Valls, Gustau and Ruiz, Francisco J. R. and Valera, Isabel}, volume = {151}, series = {Proceedings of Machine Learning Research}, month = {28--30 Mar}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v151/barrier22a/barrier22a.pdf}, url = {https://proceedings.mlr.press/v151/barrier22a.html}, abstract = { We propose a new strategy for best-arm identification with fixed confidence of Gaussian variables with bounded means and unit variance. This strategy, called Exploration-Biased Sampling, is not only asymptotically optimal: it is to the best of our knowledge the first strategy with non-asymptotic bounds that asymptotically matches the sample complexity. But the main advantage over other algorithms like Track-and-Stop is an improved behavior regarding exploration: Exploration-Biased Sampling is biased towards exploration in a subtle but natural way that makes it more stable and interpretable. These improvements are allowed by a new analysis of the sample complexity optimization problem, which yields a faster numerical resolution scheme and several quantitative regularity results that we believe of high independent interest. } }
Endnote
%0 Conference Paper %T A Non-asymptotic Approach to Best-Arm Identification for Gaussian Bandits %A Antoine Barrier %A Aurélien Garivier %A Tomáš Kocák %B Proceedings of The 25th International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2022 %E Gustau Camps-Valls %E Francisco J. R. Ruiz %E Isabel Valera %F pmlr-v151-barrier22a %I PMLR %P 10078--10109 %U https://proceedings.mlr.press/v151/barrier22a.html %V 151 %X We propose a new strategy for best-arm identification with fixed confidence of Gaussian variables with bounded means and unit variance. This strategy, called Exploration-Biased Sampling, is not only asymptotically optimal: it is to the best of our knowledge the first strategy with non-asymptotic bounds that asymptotically matches the sample complexity. But the main advantage over other algorithms like Track-and-Stop is an improved behavior regarding exploration: Exploration-Biased Sampling is biased towards exploration in a subtle but natural way that makes it more stable and interpretable. These improvements are allowed by a new analysis of the sample complexity optimization problem, which yields a faster numerical resolution scheme and several quantitative regularity results that we believe of high independent interest.
APA
Barrier, A., Garivier, A. & Kocák, T.. (2022). A Non-asymptotic Approach to Best-Arm Identification for Gaussian Bandits . Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 151:10078-10109 Available from https://proceedings.mlr.press/v151/barrier22a.html.

Related Material