Pure Exploration of Multi-armed Bandit Under Matroid Constraints

Lijie Chen; Anupam Gupta; Jian Li

29th Annual Conference on Learning Theory, PMLR 49:647-669, 2016.

Abstract

We study the pure exploration problem subject to a matroid constraint (Best-Basis) in a stochastic multi-armed bandit game. In a Best-Basis instance, we are given n stochastic arms with unknown reward distributions, as well as a matroid $\mathcal{M}$ over the arms. Let the weight of an arm be the mean of its reward distribution. Our goal is to identify a basis of $\mathcal{M}$ with the maximum total weight, using as few samples as possible. The problem is a significant generalization of the best arm identification problem and the top-k arm identification problem, which have attracted significant attention in recent years. We study both the exact and PAC versions of Best-Basis, and provide algorithms with nearly-optimal sample complexities for these versions. Our results generalize and/or improve on several previous results for the top-k arm identification problem and the combinatorial pure exploration problem when the combinatorial constraint is a matroid.
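To make the target of Best-Basis concrete, the following is a minimal sketch of the standard greedy procedure for a maximum-weight matroid basis, assuming the arm means are already known. It is not the paper's sampling algorithm, which must work with unknown means; it only shows what object is being identified and why best-arm and top-k identification are special cases (uniform matroids of rank 1 and k). The names `max_weight_basis`, `uniform_matroid`, and the example means are hypothetical.

```python
from typing import Callable, Dict, List


def max_weight_basis(
    weights: Dict[str, float],
    is_independent: Callable[[List[str]], bool],
) -> List[str]:
    """Greedy: scan arms by decreasing weight, keep each one that preserves independence."""
    basis: List[str] = []
    for arm in sorted(weights, key=lambda a: weights[a], reverse=True):
        if is_independent(basis + [arm]):
            basis.append(arm)
    return basis


def uniform_matroid(k: int) -> Callable[[List[str]], bool]:
    """Independence oracle of the rank-k uniform matroid: any set of at most k arms."""
    return lambda arms: len(arms) <= k


# Hypothetical mean rewards, assumed known here only for illustration.
means = {"a": 0.9, "b": 0.5, "c": 0.7, "d": 0.2}
top2 = max_weight_basis(means, uniform_matroid(2))  # top-k identification with k = 2
best = max_weight_basis(means, uniform_matroid(1))  # best arm identification
print(top2, best)
```

In the bandit setting the means are not available, so a pure exploration algorithm has to decide how many samples to draw from each arm before it can commit to such a basis.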

Cite this Paper

BibTeX


@InProceedings{pmlr-v49-chen16a,
  title = 	 {Pure Exploration of Multi-armed Bandit Under Matroid Constraints},
  author = 	 {Chen, Lijie and Gupta, Anupam and Li, Jian},
  booktitle = 	 {29th Annual Conference on Learning Theory},
  pages = 	 {647--669},
  year = 	 {2016},
  editor = 	 {Feldman, Vitaly and Rakhlin, Alexander and Shamir, Ohad},
  volume = 	 {49},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Columbia University, New York, New York, USA},
  month = 	 {23--26 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v49/chen16a.pdf},
  url = 	 {https://proceedings.mlr.press/v49/chen16a.html},
  abstract = 	 {We study the pure exploration problem subject to a matroid constraint (Best-Basis) in a stochastic multi-armed bandit game. In a Best-Basis instance, we are given n stochastic arms with unknown reward distributions, as well as a matroid $\mathcal{M}$ over the arms. Let the weight of an arm be the mean of its reward distribution. Our goal is to identify a basis of $\mathcal{M}$ with the maximum total weight, using as few samples as possible. The problem is a significant generalization of the best arm identification problem and the top-k arm identification problem, which have attracted significant attention in recent years. We study both the exact and PAC versions of Best-Basis, and provide algorithms with nearly-optimal sample complexities for these versions. Our results generalize and/or improve on several previous results for the top-k arm identification problem and the combinatorial pure exploration problem when the combinatorial constraint is a matroid.}
}

Endnote

%0 Conference Paper
%T Pure Exploration of Multi-armed Bandit Under Matroid Constraints
%A Lijie Chen
%A Anupam Gupta
%A Jian Li
%B 29th Annual Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2016
%E Vitaly Feldman
%E Alexander Rakhlin
%E Ohad Shamir
%F pmlr-v49-chen16a
%I PMLR
%P 647--669
%U https://proceedings.mlr.press/v49/chen16a.html
%V 49
%X We study the pure exploration problem subject to a matroid constraint (Best-Basis) in a stochastic multi-armed bandit game. In a Best-Basis instance, we are given n stochastic arms with unknown reward distributions, as well as a matroid $\mathcal{M}$ over the arms. Let the weight of an arm be the mean of its reward distribution. Our goal is to identify a basis of $\mathcal{M}$ with the maximum total weight, using as few samples as possible. The problem is a significant generalization of the best arm identification problem and the top-k arm identification problem, which have attracted significant attention in recent years. We study both the exact and PAC versions of Best-Basis, and provide algorithms with nearly-optimal sample complexities for these versions. Our results generalize and/or improve on several previous results for the top-k arm identification problem and the combinatorial pure exploration problem when the combinatorial constraint is a matroid.

RIS


TY  - CPAPER
TI  - Pure Exploration of Multi-armed Bandit Under Matroid Constraints
AU  - Lijie Chen
AU  - Anupam Gupta
AU  - Jian Li
BT  - 29th Annual Conference on Learning Theory
DA  - 2016/06/06
ED  - Vitaly Feldman
ED  - Alexander Rakhlin
ED  - Ohad Shamir
ID  - pmlr-v49-chen16a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 49
SP  - 647
EP  - 669
L1  - http://proceedings.mlr.press/v49/chen16a.pdf
UR  - https://proceedings.mlr.press/v49/chen16a.html
AB  - We study the pure exploration problem subject to a matroid constraint (Best-Basis) in a stochastic multi-armed bandit game. In a Best-Basis instance, we are given n stochastic arms with unknown reward distributions, as well as a matroid $\mathcal{M}$ over the arms. Let the weight of an arm be the mean of its reward distribution. Our goal is to identify a basis of $\mathcal{M}$ with the maximum total weight, using as few samples as possible. The problem is a significant generalization of the best arm identification problem and the top-k arm identification problem, which have attracted significant attention in recent years. We study both the exact and PAC versions of Best-Basis, and provide algorithms with nearly-optimal sample complexities for these versions. Our results generalize and/or improve on several previous results for the top-k arm identification problem and the combinatorial pure exploration problem when the combinatorial constraint is a matroid.
ER  -

APA


Chen, L., Gupta, A. & Li, J. (2016). Pure Exploration of Multi-armed Bandit Under Matroid Constraints. 29th Annual Conference on Learning Theory, in Proceedings of Machine Learning Research 49:647-669. Available from https://proceedings.mlr.press/v49/chen16a.html.
