Sparse Stochastic Bandits

Joon Kwon, Vianney Perchet, Claire Vernade
Proceedings of the 2017 Conference on Learning Theory, PMLR 65:1269-1270, 2017.

Abstract

In the classical multi-armed bandit problem, $d$ arms are available to the decision maker who pulls them sequentially in order to maximize his cumulative reward. Guarantees can be obtained on a relative quantity called regret, which scales linearly with $d$ (or with $\sqrt{d}$ in the minimax sense). We here consider the \emph{sparse} case of this classical problem in the sense that only a small number of arms, namely $s<d$, have a positive expected reward.
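
For context, the regret mentioned above can be written in the standard way (notation assumed here, not quoted from the paper): with arm means $\mu_1,\dots,\mu_d$, best mean $\mu^\star=\max_i \mu_i$, and $A_t$ the arm pulled at round $t$, the cumulative pseudo-regret after $T$ rounds is
$$R_T \;=\; T\mu^\star \;-\; \mathbb{E}\Big[\sum_{t=1}^{T}\mu_{A_t}\Big].$$
Classical index policies such as UCB guarantee $R_T = O\big(\sum_{i:\Delta_i>0}\log T/\Delta_i\big)$ with gaps $\Delta_i=\mu^\star-\mu_i$, a bound with one term per arm and hence linear in $d$, while the minimax rate is of order $\sqrt{dT}$; the sparse setting considered here aims for guarantees that depend on $s$ rather than $d$.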

Cite this Paper


BibTeX
@InProceedings{pmlr-v65-kwon17a,
  title     = {Sparse Stochastic Bandits},
  author    = {Kwon, Joon and Perchet, Vianney and Vernade, Claire},
  booktitle = {Proceedings of the 2017 Conference on Learning Theory},
  pages     = {1269--1270},
  year      = {2017},
  editor    = {Kale, Satyen and Shamir, Ohad},
  volume    = {65},
  series    = {Proceedings of Machine Learning Research},
  month     = {07--10 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v65/kwon17a/kwon17a.pdf},
  url       = {https://proceedings.mlr.press/v65/kwon17a.html}
}
Endnote
%0 Conference Paper
%T Sparse Stochastic Bandits
%A Joon Kwon
%A Vianney Perchet
%A Claire Vernade
%B Proceedings of the 2017 Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2017
%E Satyen Kale
%E Ohad Shamir
%F pmlr-v65-kwon17a
%I PMLR
%P 1269--1270
%U https://proceedings.mlr.press/v65/kwon17a.html
%V 65
APA
Kwon, J., Perchet, V. & Vernade, C. (2017). Sparse Stochastic Bandits. Proceedings of the 2017 Conference on Learning Theory, in Proceedings of Machine Learning Research 65:1269-1270. Available from https://proceedings.mlr.press/v65/kwon17a.html.