Structured Best Arm Identification with Fixed Confidence

Ruitong Huang, Mohammad M. Ajallooeian, Csaba Szepesvári, Martin Müller
Proceedings of the 28th International Conference on Algorithmic Learning Theory, PMLR 76:593-616, 2017.

Abstract

We study the problem of identifying the best action among a set of possible options when the value of each action is given by a mapping from a number of noisy micro-observables in the so-called fixed confidence setting. Our main motivation is the application to minimax game search, which has been a major topic of interest in artificial intelligence. In this paper we introduce an abstract setting to clearly describe the essential properties of the problem. While previous work only considered a two-move-deep game tree search problem, our abstract setting can be applied to the general minimax games where the depth can be non-uniform and arbitrary, and transpositions are allowed. We introduce a new algorithm (LUCB-micro) for the abstract setting, and give its lower and upper sample complexity results. Our bounds recover some previous results, achieved in more limited settings, and also shed further light on how the structure of minimax problems influences sample complexity.

Cite this Paper


BibTeX
@InProceedings{pmlr-v76-huang17a, title = {Structured Best Arm Identification with Fixed Confidence}, author = {Huang, Ruitong and Ajallooeian, Mohammad M. and Szepesvári, Csaba and Müller, Martin}, booktitle = {Proceedings of the 28th International Conference on Algorithmic Learning Theory}, pages = {593--616}, year = {2017}, editor = {Hanneke, Steve and Reyzin, Lev}, volume = {76}, series = {Proceedings of Machine Learning Research}, month = {15--17 Oct}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v76/huang17a/huang17a.pdf}, url = {https://proceedings.mlr.press/v76/huang17a.html}, abstract = {We study the problem of identifying the best action among a set of possible options when the value of each action is given by a mapping from a number of noisy micro-observables in the so-called fixed confidence setting. Our main motivation is the application to minimax game search, which has been a major topic of interest in artificial intelligence. In this paper we introduce an abstract setting to clearly describe the essential properties of the problem. While previous work only considered a two-move-deep game tree search problem, our abstract setting can be applied to the general minimax games where the depth can be non-uniform and arbitrary, and transpositions are allowed. We introduce a new algorithm (LUCB-micro) for the abstract setting, and give its lower and upper sample complexity results. Our bounds recover some previous results, achieved in more limited settings, and also shed further light on how the structure of minimax problems influences sample complexity.} }
Endnote
%0 Conference Paper %T Structured Best Arm Identification with Fixed Confidence %A Ruitong Huang %A Mohammad M. Ajallooeian %A Csaba Szepesvári %A Martin Müller %B Proceedings of the 28th International Conference on Algorithmic Learning Theory %C Proceedings of Machine Learning Research %D 2017 %E Steve Hanneke %E Lev Reyzin %F pmlr-v76-huang17a %I PMLR %P 593--616 %U https://proceedings.mlr.press/v76/huang17a.html %V 76 %X We study the problem of identifying the best action among a set of possible options when the value of each action is given by a mapping from a number of noisy micro-observables in the so-called fixed confidence setting. Our main motivation is the application to minimax game search, which has been a major topic of interest in artificial intelligence. In this paper we introduce an abstract setting to clearly describe the essential properties of the problem. While previous work only considered a two-move-deep game tree search problem, our abstract setting can be applied to the general minimax games where the depth can be non-uniform and arbitrary, and transpositions are allowed. We introduce a new algorithm (LUCB-micro) for the abstract setting, and give its lower and upper sample complexity results. Our bounds recover some previous results, achieved in more limited settings, and also shed further light on how the structure of minimax problems influences sample complexity.
APA
Huang, R., Ajallooeian, M.M., Szepesvári, C. & Müller, M.. (2017). Structured Best Arm Identification with Fixed Confidence. Proceedings of the 28th International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 76:593-616 Available from https://proceedings.mlr.press/v76/huang17a.html.

Related Material