Exploring $k$ out of Top $\rho$ Fraction of Arms in Stochastic Bandits

Wenbo Ren, Jia Liu, Ness B. Shroff
Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, PMLR 89:2820-2828, 2019.

Abstract

This paper studies the problem of identifying any $k$ distinct arms among the top $\rho$ fraction (e.g., top 5%) of arms from a finite or infinite set with a probably approximately correct (PAC) tolerance $\epsilon$. We consider two cases: (i) when the threshold of the top arms’ expected rewards is known and (ii) when it is unknown. We prove lower bounds for the four variants (finite or infinite arms, and known or unknown threshold), and propose algorithms for each. Two of these algorithms are shown to be sample complexity optimal (up to constant factors) and the other two are optimal up to a log factor. Results in this paper provide up to $\rho n/k$ reductions compared with the “$k$-exploration” algorithms that focus on finding the (PAC) best $k$ arms out of $n$ arms. We also numerically show improvements over the state-of-the-art.

Cite this Paper


BibTeX
@InProceedings{pmlr-v89-ren19a,
  title     = {Exploring $k$ out of Top $\rho$ Fraction of Arms in Stochastic Bandits},
  author    = {Ren, Wenbo and Liu, Jia and Shroff, Ness B.},
  booktitle = {Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics},
  pages     = {2820--2828},
  year      = {2019},
  editor    = {Chaudhuri, Kamalika and Sugiyama, Masashi},
  volume    = {89},
  series    = {Proceedings of Machine Learning Research},
  month     = {16--18 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v89/ren19a/ren19a.pdf},
  url       = {https://proceedings.mlr.press/v89/ren19a.html},
  abstract  = {This paper studies the problem of identifying any $k$ distinct arms among the top $\rho$ fraction (e.g., top 5\%) of arms from a finite or infinite set with a probably approximately correct (PAC) tolerance $\epsilon$. We consider two cases: (i) when the threshold of the top arms’ expected rewards is known and (ii) when it is unknown. We prove lower bounds for the four variants (finite or infinite arms, and known or unknown threshold), and propose algorithms for each. Two of these algorithms are shown to be sample complexity optimal (up to constant factors) and the other two are optimal up to a log factor. Results in this paper provide up to $\rho n/k$ reductions compared with the “$k$-exploration” algorithms that focus on finding the (PAC) best $k$ arms out of $n$ arms. We also numerically show improvements over the state-of-the-art.}
}
Endnote
%0 Conference Paper
%T Exploring $k$ out of Top $\rho$ Fraction of Arms in Stochastic Bandits
%A Wenbo Ren
%A Jia Liu
%A Ness B. Shroff
%B Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Masashi Sugiyama
%F pmlr-v89-ren19a
%I PMLR
%P 2820--2828
%U https://proceedings.mlr.press/v89/ren19a.html
%V 89
%X This paper studies the problem of identifying any $k$ distinct arms among the top $\rho$ fraction (e.g., top 5%) of arms from a finite or infinite set with a probably approximately correct (PAC) tolerance $\epsilon$. We consider two cases: (i) when the threshold of the top arms’ expected rewards is known and (ii) when it is unknown. We prove lower bounds for the four variants (finite or infinite arms, and known or unknown threshold), and propose algorithms for each. Two of these algorithms are shown to be sample complexity optimal (up to constant factors) and the other two are optimal up to a log factor. Results in this paper provide up to $\rho n/k$ reductions compared with the “$k$-exploration” algorithms that focus on finding the (PAC) best $k$ arms out of $n$ arms. We also numerically show improvements over the state-of-the-art.
APA
Ren, W., Liu, J., & Shroff, N. B. (2019). Exploring $k$ out of Top $\rho$ Fraction of Arms in Stochastic Bandits. Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 89:2820-2828. Available from https://proceedings.mlr.press/v89/ren19a.html.
