Open Problem: Best Arm Identification: Almost Instance-Wise Optimality and the Gap Entropy Conjecture

Lijie Chen; Jian Li

Open Problem: Best Arm Identification: Almost Instance-Wise Optimality and the Gap Entropy Conjecture

Lijie Chen, Jian Li

29th Annual Conference on Learning Theory, PMLR 49:1643-1646, 2016.

Abstract

The best arm identification problem (BEST-1-ARM) is the most basic pure exploration problem in stochastic multi-armed bandits. The problem has a long history and attracted significant attention for the last decade. However, we do not yet have a complete understanding of the optimal sample complexity of the problem: The state-of-the-art algorithms achieve a sample complexity of O(\sum_i=2^n \Delta_i^-2(\lnδ^-1 + \ln\ln\Delta_i^-1)) (\Delta_i is the difference between the largest mean and the i^th mean), while the best known lower bound is Ω(\sum_i=2^n \Delta_i^-2\lnδ^-1) for general instances and Ω(∆^-2 \ln\ln ∆^-1) for the two-arm instances. We propose to study the instance-wise optimality for the BEST-1-ARM problem. Previous work has proved that it is impossible to have an instance optimal algorithm for the 2-arm problem. However, we conjecture that modulo the additive term Ω(\Delta_2^-2 \ln\ln \Delta_2^-1) (which is an upper bound and worst case lower bound for the 2-arm problem), there is an instance optimal algorithm for BEST-1-ARM. Moreover, we introduce a new quantity, called the gap entropy for a best-arm problem instance, and conjecture that it is the instance-wise lower bound. Hence, resolving this conjecture would provide a final answer to the old and basic problem.

Cite this Paper

BibTeX


@InProceedings{pmlr-v49-chen16b,
  title = 	 {Open Problem: Best Arm Identification: Almost Instance-Wise Optimality and the Gap Entropy Conjecture},
  author = 	 {Chen, Lijie and Li, Jian},
  booktitle = 	 {29th Annual Conference on Learning Theory},
  pages = 	 {1643--1646},
  year = 	 {2016},
  editor = 	 {Feldman, Vitaly and Rakhlin, Alexander and Shamir, Ohad},
  volume = 	 {49},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Columbia University, New York, New York, USA},
  month = 	 {23--26 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v49/chen16b.pdf},
  url = 	 {https://proceedings.mlr.press/v49/chen16b.html},
  abstract = 	 {The best arm identification problem (BEST-1-ARM) is the most basic pure exploration problem in stochastic multi-armed bandits. The problem has a long history and attracted significant attention for the last decade. However, we do not yet have a complete understanding of the optimal sample complexity of the problem: The state-of-the-art algorithms achieve a sample complexity of O(\sum_i=2^n \Delta_i^-2(\lnδ^-1 + \ln\ln\Delta_i^-1)) (\Delta_i is the difference between the largest mean and the i^th mean), while the best known lower bound is Ω(\sum_i=2^n \Delta_i^-2\lnδ^-1) for general instances and Ω(∆^-2 \ln\ln ∆^-1) for the two-arm instances. We propose to study the instance-wise optimality for the BEST-1-ARM problem. Previous work has proved that it is impossible to have an instance optimal algorithm for the 2-arm problem. However, we conjecture that modulo the additive term Ω(\Delta_2^-2 \ln\ln \Delta_2^-1) (which is an upper bound and worst case lower bound for the 2-arm problem), there is an instance optimal algorithm for BEST-1-ARM. Moreover, we introduce a new quantity, called the gap entropy for a best-arm problem instance, and conjecture that it is the instance-wise lower bound. Hence, resolving this conjecture would provide a final answer to the old and basic problem.}
}

Endnote

%0 Conference Paper
%T Open Problem: Best Arm Identification: Almost Instance-Wise Optimality and the Gap Entropy Conjecture
%A Lijie Chen
%A Jian Li
%B 29th Annual Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2016
%E Vitaly Feldman
%E Alexander Rakhlin
%E Ohad Shamir	
%F pmlr-v49-chen16b
%I PMLR
%P 1643--1646
%U https://proceedings.mlr.press/v49/chen16b.html
%V 49
%X The best arm identification problem (BEST-1-ARM) is the most basic pure exploration problem in stochastic multi-armed bandits. The problem has a long history and attracted significant attention for the last decade. However, we do not yet have a complete understanding of the optimal sample complexity of the problem: The state-of-the-art algorithms achieve a sample complexity of O(\sum_i=2^n \Delta_i^-2(\lnδ^-1 + \ln\ln\Delta_i^-1)) (\Delta_i is the difference between the largest mean and the i^th mean), while the best known lower bound is Ω(\sum_i=2^n \Delta_i^-2\lnδ^-1) for general instances and Ω(∆^-2 \ln\ln ∆^-1) for the two-arm instances. We propose to study the instance-wise optimality for the BEST-1-ARM problem. Previous work has proved that it is impossible to have an instance optimal algorithm for the 2-arm problem. However, we conjecture that modulo the additive term Ω(\Delta_2^-2 \ln\ln \Delta_2^-1) (which is an upper bound and worst case lower bound for the 2-arm problem), there is an instance optimal algorithm for BEST-1-ARM. Moreover, we introduce a new quantity, called the gap entropy for a best-arm problem instance, and conjecture that it is the instance-wise lower bound. Hence, resolving this conjecture would provide a final answer to the old and basic problem.

RIS


TY  - CPAPER
TI  - Open Problem: Best Arm Identification: Almost Instance-Wise Optimality and the Gap Entropy Conjecture
AU  - Lijie Chen
AU  - Jian Li
BT  - 29th Annual Conference on Learning Theory
DA  - 2016/06/06
ED  - Vitaly Feldman
ED  - Alexander Rakhlin
ED  - Ohad Shamir	
ID  - pmlr-v49-chen16b
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 49
SP  - 1643
EP  - 1646
L1  - http://proceedings.mlr.press/v49/chen16b.pdf
UR  - https://proceedings.mlr.press/v49/chen16b.html
AB  - The best arm identification problem (BEST-1-ARM) is the most basic pure exploration problem in stochastic multi-armed bandits. The problem has a long history and attracted significant attention for the last decade. However, we do not yet have a complete understanding of the optimal sample complexity of the problem: The state-of-the-art algorithms achieve a sample complexity of O(\sum_i=2^n \Delta_i^-2(\lnδ^-1 + \ln\ln\Delta_i^-1)) (\Delta_i is the difference between the largest mean and the i^th mean), while the best known lower bound is Ω(\sum_i=2^n \Delta_i^-2\lnδ^-1) for general instances and Ω(∆^-2 \ln\ln ∆^-1) for the two-arm instances. We propose to study the instance-wise optimality for the BEST-1-ARM problem. Previous work has proved that it is impossible to have an instance optimal algorithm for the 2-arm problem. However, we conjecture that modulo the additive term Ω(\Delta_2^-2 \ln\ln \Delta_2^-1) (which is an upper bound and worst case lower bound for the 2-arm problem), there is an instance optimal algorithm for BEST-1-ARM. Moreover, we introduce a new quantity, called the gap entropy for a best-arm problem instance, and conjecture that it is the instance-wise lower bound. Hence, resolving this conjecture would provide a final answer to the old and basic problem.
ER  -

APA


Chen, L. & Li, J.. (2016). Open Problem: Best Arm Identification: Almost Instance-Wise Optimality and the Gap Entropy Conjecture. 29th Annual Conference on Learning Theory, in Proceedings of Machine Learning Research 49:1643-1646 Available from https://proceedings.mlr.press/v49/chen16b.html.

Related Material

Download PDF