Multiple Identifications in Multi-Armed Bandits

Séebastian Bubeck, Tengyao Wang, Nitin Viswanathan
Proceedings of the 30th International Conference on Machine Learning, PMLR 28(1):258-265, 2013.

Abstract

We study the problem of identifying the top m arms in a multi-armed bandit game. Our proposed solution relies on a new algorithm based on successive rejects of the seemingly bad arms, and successive accepts of the good ones. This algorithmic contribution allows to tackle other multiple identifications settings that were previously out of reach. In particular we show that this idea of successive accepts and rejects applies to the multi-bandit best arm identification problem.

Cite this Paper


BibTeX
@InProceedings{pmlr-v28-bubeck13, title = {Multiple Identifications in Multi-Armed Bandits}, author = {Bubeck, Séebastian and Wang, Tengyao and Viswanathan, Nitin}, booktitle = {Proceedings of the 30th International Conference on Machine Learning}, pages = {258--265}, year = {2013}, editor = {Dasgupta, Sanjoy and McAllester, David}, volume = {28}, number = {1}, series = {Proceedings of Machine Learning Research}, address = {Atlanta, Georgia, USA}, month = {17--19 Jun}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v28/bubeck13.pdf}, url = {https://proceedings.mlr.press/v28/bubeck13.html}, abstract = {We study the problem of identifying the top m arms in a multi-armed bandit game. Our proposed solution relies on a new algorithm based on successive rejects of the seemingly bad arms, and successive accepts of the good ones. This algorithmic contribution allows to tackle other multiple identifications settings that were previously out of reach. In particular we show that this idea of successive accepts and rejects applies to the multi-bandit best arm identification problem.} }
Endnote
%0 Conference Paper %T Multiple Identifications in Multi-Armed Bandits %A Séebastian Bubeck %A Tengyao Wang %A Nitin Viswanathan %B Proceedings of the 30th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2013 %E Sanjoy Dasgupta %E David McAllester %F pmlr-v28-bubeck13 %I PMLR %P 258--265 %U https://proceedings.mlr.press/v28/bubeck13.html %V 28 %N 1 %X We study the problem of identifying the top m arms in a multi-armed bandit game. Our proposed solution relies on a new algorithm based on successive rejects of the seemingly bad arms, and successive accepts of the good ones. This algorithmic contribution allows to tackle other multiple identifications settings that were previously out of reach. In particular we show that this idea of successive accepts and rejects applies to the multi-bandit best arm identification problem.
RIS
TY - CPAPER TI - Multiple Identifications in Multi-Armed Bandits AU - Séebastian Bubeck AU - Tengyao Wang AU - Nitin Viswanathan BT - Proceedings of the 30th International Conference on Machine Learning DA - 2013/02/13 ED - Sanjoy Dasgupta ED - David McAllester ID - pmlr-v28-bubeck13 PB - PMLR DP - Proceedings of Machine Learning Research VL - 28 IS - 1 SP - 258 EP - 265 L1 - http://proceedings.mlr.press/v28/bubeck13.pdf UR - https://proceedings.mlr.press/v28/bubeck13.html AB - We study the problem of identifying the top m arms in a multi-armed bandit game. Our proposed solution relies on a new algorithm based on successive rejects of the seemingly bad arms, and successive accepts of the good ones. This algorithmic contribution allows to tackle other multiple identifications settings that were previously out of reach. In particular we show that this idea of successive accepts and rejects applies to the multi-bandit best arm identification problem. ER -
APA
Bubeck, S., Wang, T. & Viswanathan, N.. (2013). Multiple Identifications in Multi-Armed Bandits. Proceedings of the 30th International Conference on Machine Learning, in Proceedings of Machine Learning Research 28(1):258-265 Available from https://proceedings.mlr.press/v28/bubeck13.html.

Related Material