Improved Learning Complexity in Combinatorial Pure Exploration Bandits

Victor Gabillon; Alessandro Lazaric; Mohammad Ghavamzadeh; Ronald Ortner; Peter Bartlett

Improved Learning Complexity in Combinatorial Pure Exploration Bandits

Victor Gabillon, Alessandro Lazaric, Mohammad Ghavamzadeh, Ronald Ortner, Peter Bartlett

Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, PMLR 51:1004-1012, 2016.

Abstract

We study the problem of combinatorial pure exploration in the stochastic multi-armed bandit problem. We first construct a new measure of complexity that provably characterizes the learning performance of the algorithms we propose for the fixed confidence and the fixed budget setting. We show that this complexity is never higher than the one in existing work and illustrate a number of configurations in which it can be significantly smaller. While in general this improvement comes at the cost of increased computational complexity, we provide a series of examples, including a planning problem, where this extra cost is not significant.

Cite this Paper

BibTeX


@InProceedings{pmlr-v51-gabillon16,
  title = 	 {Improved Learning Complexity in Combinatorial Pure Exploration Bandits},
  author = 	 {Gabillon, Victor and Lazaric, Alessandro and Ghavamzadeh, Mohammad and Ortner, Ronald and Bartlett, Peter},
  booktitle = 	 {Proceedings of the 19th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {1004--1012},
  year = 	 {2016},
  editor = 	 {Gretton, Arthur and Robert, Christian C.},
  volume = 	 {51},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Cadiz, Spain},
  month = 	 {09--11 May},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v51/gabillon16.pdf},
  url = 	 {https://proceedings.mlr.press/v51/gabillon16.html},
  abstract = 	 {We study the problem of combinatorial pure exploration in the stochastic multi-armed bandit problem. We first construct a new measure of complexity that provably characterizes the learning performance of the algorithms we propose for the fixed confidence and the fixed budget setting. We show that this complexity is never higher than the one in existing work and illustrate a number of configurations in which it can be significantly smaller. While in general this improvement comes at the cost of increased computational complexity, we provide a series of examples, including a planning problem, where this extra cost is not significant.}
}

Endnote

%0 Conference Paper
%T Improved Learning Complexity in Combinatorial Pure Exploration Bandits
%A Victor Gabillon
%A Alessandro Lazaric
%A Mohammad Ghavamzadeh
%A Ronald Ortner
%A Peter Bartlett
%B Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2016
%E Arthur Gretton
%E Christian C. Robert	
%F pmlr-v51-gabillon16
%I PMLR
%P 1004--1012
%U https://proceedings.mlr.press/v51/gabillon16.html
%V 51
%X We study the problem of combinatorial pure exploration in the stochastic multi-armed bandit problem. We first construct a new measure of complexity that provably characterizes the learning performance of the algorithms we propose for the fixed confidence and the fixed budget setting. We show that this complexity is never higher than the one in existing work and illustrate a number of configurations in which it can be significantly smaller. While in general this improvement comes at the cost of increased computational complexity, we provide a series of examples, including a planning problem, where this extra cost is not significant.

RIS


TY  - CPAPER
TI  - Improved Learning Complexity in Combinatorial Pure Exploration Bandits
AU  - Victor Gabillon
AU  - Alessandro Lazaric
AU  - Mohammad Ghavamzadeh
AU  - Ronald Ortner
AU  - Peter Bartlett
BT  - Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
DA  - 2016/05/02
ED  - Arthur Gretton
ED  - Christian C. Robert	
ID  - pmlr-v51-gabillon16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 51
SP  - 1004
EP  - 1012
L1  - http://proceedings.mlr.press/v51/gabillon16.pdf
UR  - https://proceedings.mlr.press/v51/gabillon16.html
AB  - We study the problem of combinatorial pure exploration in the stochastic multi-armed bandit problem. We first construct a new measure of complexity that provably characterizes the learning performance of the algorithms we propose for the fixed confidence and the fixed budget setting. We show that this complexity is never higher than the one in existing work and illustrate a number of configurations in which it can be significantly smaller. While in general this improvement comes at the cost of increased computational complexity, we provide a series of examples, including a planning problem, where this extra cost is not significant.
ER  -

APA


Gabillon, V., Lazaric, A., Ghavamzadeh, M., Ortner, R. & Bartlett, P.. (2016). Improved Learning Complexity in Combinatorial Pure Exploration Bandits. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 51:1004-1012 Available from https://proceedings.mlr.press/v51/gabillon16.html.

Improved Learning Complexity in Combinatorial Pure Exploration Bandits

Abstract

Cite this Paper

Related Material