Pure Exploration of Multi-armed Bandit Under Matroid Constraints

Lijie Chen, Anupam Gupta, Jian Li
; 29th Annual Conference on Learning Theory, PMLR 49:647-669, 2016.

Abstract

We study the pure exploration problem subject to a matroid constraint (Best-Basis) in a stochastic multi-armed bandit game. In a Best-Basis instance, we are given n stochastic arms with unknown reward distributions, as well as a matroid \mathcalM over the arms. Let the weight of an arm be the mean of its reward distribution. Our goal is to identify a basis of \mathcalM with the maximum total weight, using as few samples as possible. The problem is a significant generalization of the best arm identification problem and the top-k arm identification problem, which have attracted significant attentions in recent years. We study both the exact and PAC versions of Best-Basis, and provide algorithms with nearly-optimal sample complexities for these versions. Our results generalize and/or improve on several previous results for the top-k arm identification problem and the combinatorial pure exploration problem when the combinatorial constraint is a matroid.

Cite this Paper


BibTeX
@InProceedings{pmlr-v49-chen16a, title = {Pure Exploration of Multi-armed Bandit Under Matroid Constraints}, author = {Lijie Chen and Anupam Gupta and Jian Li}, booktitle = {29th Annual Conference on Learning Theory}, pages = {647--669}, year = {2016}, editor = {Vitaly Feldman and Alexander Rakhlin and Ohad Shamir}, volume = {49}, series = {Proceedings of Machine Learning Research}, address = {Columbia University, New York, New York, USA}, month = {23--26 Jun}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v49/chen16a.pdf}, url = {http://proceedings.mlr.press/v49/chen16a.html}, abstract = {We study the pure exploration problem subject to a matroid constraint (Best-Basis) in a stochastic multi-armed bandit game. In a Best-Basis instance, we are given n stochastic arms with unknown reward distributions, as well as a matroid \mathcalM over the arms. Let the weight of an arm be the mean of its reward distribution. Our goal is to identify a basis of \mathcalM with the maximum total weight, using as few samples as possible. The problem is a significant generalization of the best arm identification problem and the top-k arm identification problem, which have attracted significant attentions in recent years. We study both the exact and PAC versions of Best-Basis, and provide algorithms with nearly-optimal sample complexities for these versions. Our results generalize and/or improve on several previous results for the top-k arm identification problem and the combinatorial pure exploration problem when the combinatorial constraint is a matroid.} }
Endnote
%0 Conference Paper %T Pure Exploration of Multi-armed Bandit Under Matroid Constraints %A Lijie Chen %A Anupam Gupta %A Jian Li %B 29th Annual Conference on Learning Theory %C Proceedings of Machine Learning Research %D 2016 %E Vitaly Feldman %E Alexander Rakhlin %E Ohad Shamir %F pmlr-v49-chen16a %I PMLR %J Proceedings of Machine Learning Research %P 647--669 %U http://proceedings.mlr.press %V 49 %W PMLR %X We study the pure exploration problem subject to a matroid constraint (Best-Basis) in a stochastic multi-armed bandit game. In a Best-Basis instance, we are given n stochastic arms with unknown reward distributions, as well as a matroid \mathcalM over the arms. Let the weight of an arm be the mean of its reward distribution. Our goal is to identify a basis of \mathcalM with the maximum total weight, using as few samples as possible. The problem is a significant generalization of the best arm identification problem and the top-k arm identification problem, which have attracted significant attentions in recent years. We study both the exact and PAC versions of Best-Basis, and provide algorithms with nearly-optimal sample complexities for these versions. Our results generalize and/or improve on several previous results for the top-k arm identification problem and the combinatorial pure exploration problem when the combinatorial constraint is a matroid.
RIS
TY - CPAPER TI - Pure Exploration of Multi-armed Bandit Under Matroid Constraints AU - Lijie Chen AU - Anupam Gupta AU - Jian Li BT - 29th Annual Conference on Learning Theory PY - 2016/06/06 DA - 2016/06/06 ED - Vitaly Feldman ED - Alexander Rakhlin ED - Ohad Shamir ID - pmlr-v49-chen16a PB - PMLR SP - 647 DP - PMLR EP - 669 L1 - http://proceedings.mlr.press/v49/chen16a.pdf UR - http://proceedings.mlr.press/v49/chen16a.html AB - We study the pure exploration problem subject to a matroid constraint (Best-Basis) in a stochastic multi-armed bandit game. In a Best-Basis instance, we are given n stochastic arms with unknown reward distributions, as well as a matroid \mathcalM over the arms. Let the weight of an arm be the mean of its reward distribution. Our goal is to identify a basis of \mathcalM with the maximum total weight, using as few samples as possible. The problem is a significant generalization of the best arm identification problem and the top-k arm identification problem, which have attracted significant attentions in recent years. We study both the exact and PAC versions of Best-Basis, and provide algorithms with nearly-optimal sample complexities for these versions. Our results generalize and/or improve on several previous results for the top-k arm identification problem and the combinatorial pure exploration problem when the combinatorial constraint is a matroid. ER -
APA
Chen, L., Gupta, A. & Li, J.. (2016). Pure Exploration of Multi-armed Bandit Under Matroid Constraints. 29th Annual Conference on Learning Theory, in PMLR 49:647-669

Related Material