Bandit Theory meets Compressed Sensing for high dimensional Stochastic Linear Bandit

Alexandra Carpentier, Remi Munos
Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, PMLR 22:190-198, 2012.

Abstract

We consider a linear stochastic bandit problem where the dimension K of the unknown parameter θ is larger than the sampling budget n. Since usual linear bandit algorithms have a regret of order O(K√n), it is in general impossible to obtain a sub-linear regret without further assumptions. In this paper we assume that θ is S-sparse, i.e. has at most S non-zero components, and that the set of arms is the unit ball for the ||·||_2 norm. We combine ideas from Compressed Sensing and Bandit Theory to derive an algorithm with a regret bound of O(S√n). We detail an application to the problem of optimizing a function that depends on many variables, of which only a small number (initially unknown) are relevant.
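For context, the quantity being bounded can be sketched in standard stochastic linear bandit notation (this notation is not taken verbatim from the paper): with arms x_t drawn from the unit ball B_K and unknown S-sparse parameter θ, the cumulative regret is

```latex
% Cumulative regret against the best fixed arm in the unit ball B_K,
% for unknown S-sparse parameter \theta (standard notation, a sketch):
R_n = \sum_{t=1}^{n} \Bigl( \max_{x \in B_K} \langle \theta, x \rangle
      - \langle \theta, x_t \rangle \Bigr)
    = n \,\|\theta\|_2 - \sum_{t=1}^{n} \langle \theta, x_t \rangle ,
% since the inner product over the unit ball is maximized at
% x^* = \theta / \|\theta\|_2. The paper's algorithm achieves
% R_n = O(S\sqrt{n}), versus O(K\sqrt{n}) for standard linear
% bandit algorithms, with K possibly much larger than n.
```

The improvement from K to S in the bound is what makes the budget-smaller-than-dimension regime (n < K) tractable under the sparsity assumption.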

Cite this Paper


BibTeX
@InProceedings{pmlr-v22-carpentier12,
  title     = {Bandit Theory meets Compressed Sensing for high dimensional Stochastic Linear Bandit},
  author    = {Alexandra Carpentier and Remi Munos},
  booktitle = {Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics},
  pages     = {190--198},
  year      = {2012},
  editor    = {Neil D. Lawrence and Mark Girolami},
  volume    = {22},
  series    = {Proceedings of Machine Learning Research},
  address   = {La Palma, Canary Islands},
  month     = {21--23 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v22/carpentier12/carpentier12.pdf},
  url       = {http://proceedings.mlr.press/v22/carpentier12.html},
  abstract  = {We consider a linear stochastic bandit problem where the dimension $K$ of the unknown parameter $\theta$ is larger than the sampling budget $n$. Since usual linear bandit algorithms have a regret of order $O(K\sqrt{n})$, it is in general impossible to obtain a sub-linear regret without further assumptions. In this paper we assume that $\theta$ is $S$-sparse, i.e. has at most $S$ non-zero components, and that the set of arms is the unit ball for the $||\cdot||_2$ norm. We combine ideas from Compressed Sensing and Bandit Theory to derive an algorithm with a regret bound of $O(S\sqrt{n})$. We detail an application to the problem of optimizing a function that depends on many variables, of which only a small number (initially unknown) are relevant.}
}
Endnote
%0 Conference Paper
%T Bandit Theory meets Compressed Sensing for high dimensional Stochastic Linear Bandit
%A Alexandra Carpentier
%A Remi Munos
%B Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2012
%E Neil D. Lawrence
%E Mark Girolami
%F pmlr-v22-carpentier12
%I PMLR
%J Proceedings of Machine Learning Research
%P 190--198
%U http://proceedings.mlr.press
%V 22
%W PMLR
%X We consider a linear stochastic bandit problem where the dimension K of the unknown parameter θ is larger than the sampling budget n. Since usual linear bandit algorithms have a regret of order O(K√n), it is in general impossible to obtain a sub-linear regret without further assumptions. In this paper we assume that θ is S-sparse, i.e. has at most S non-zero components, and that the set of arms is the unit ball for the ||·||_2 norm. We combine ideas from Compressed Sensing and Bandit Theory to derive an algorithm with a regret bound of O(S√n). We detail an application to the problem of optimizing a function that depends on many variables, of which only a small number (initially unknown) are relevant.
RIS
TY - CPAPER
TI - Bandit Theory meets Compressed Sensing for high dimensional Stochastic Linear Bandit
AU - Alexandra Carpentier
AU - Remi Munos
BT - Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
PY - 2012/03/21
DA - 2012/03/21
ED - Neil D. Lawrence
ED - Mark Girolami
ID - pmlr-v22-carpentier12
PB - PMLR
SP - 190
DP - PMLR
EP - 198
L1 - http://proceedings.mlr.press/v22/carpentier12/carpentier12.pdf
UR - http://proceedings.mlr.press/v22/carpentier12.html
AB - We consider a linear stochastic bandit problem where the dimension K of the unknown parameter θ is larger than the sampling budget n. Since usual linear bandit algorithms have a regret of order O(K√n), it is in general impossible to obtain a sub-linear regret without further assumptions. In this paper we assume that θ is S-sparse, i.e. has at most S non-zero components, and that the set of arms is the unit ball for the ||·||_2 norm. We combine ideas from Compressed Sensing and Bandit Theory to derive an algorithm with a regret bound of O(S√n). We detail an application to the problem of optimizing a function that depends on many variables, of which only a small number (initially unknown) are relevant.
ER -
APA
Carpentier, A. & Munos, R. (2012). Bandit Theory meets Compressed Sensing for high dimensional Stochastic Linear Bandit. Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, in PMLR 22:190-198
