Pure Exploration in Infinitely-Armed Bandit Models with Fixed-Confidence

Maryam Aziz; Jesse Anderton; Emilie Kaufmann; Javed Aslam

Pure Exploration in Infinitely-Armed Bandit Models with Fixed-Confidence

Maryam Aziz, Jesse Anderton, Emilie Kaufmann, Javed Aslam

Proceedings of Algorithmic Learning Theory, PMLR 83:3-24, 2018.

Abstract

We consider the problem of near-optimal arm identification in the fixed confidence setting of the infinitely armed bandit problem when nothing is known about the arm reservoir distribution. We (1) introduce a PAC-like framework within which to derive and cast results; (2) derive a sample complexity lower bound for near-optimal arm identification; (3) propose an algorithm that identifies a nearly-optimal arm with high probability and derive an upper bound on its sample complexity which is within a log factor of our lower bound; and (4) discuss whether our

$\log^2 \frac{1}{δ}$ dependence is inescapable for “two-phase” (select arms first, identify the best later) algorithms in the infinite setting. This work permits the application of bandit models to a broader class of problems where fewer assumptions hold.

Cite this Paper

BibTeX


@InProceedings{pmlr-v83-aziz18a,
  title = 	 {Pure Exploration in Infinitely-Armed Bandit Models with Fixed-Confidence},
  author = 	 {Aziz, Maryam and Anderton, Jesse and Kaufmann, Emilie and Aslam, Javed},
  booktitle = 	 {Proceedings of Algorithmic Learning Theory},
  pages = 	 {3--24},
  year = 	 {2018},
  editor = 	 {Janoos, Firdaus and Mohri, Mehryar and Sridharan, Karthik},
  volume = 	 {83},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {07--09 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v83/aziz18a/aziz18a.pdf},
  url = 	 {https://proceedings.mlr.press/v83/aziz18a.html},
  abstract = 	 {We consider the problem of near-optimal arm identification in the fixed confidence setting of the infinitely armed bandit problem when nothing is known about the arm reservoir distribution. We (1) introduce a PAC-like framework within which to derive and cast results; (2) derive a sample complexity lower bound for near-optimal arm identification; (3) propose an algorithm that identifies a nearly-optimal arm with high probability and derive an upper bound on its sample complexity which is within a log factor of our lower bound; and (4) discuss whether our $\log^2 \frac{1}{δ}$ dependence is inescapable for “two-phase” (select arms first, identify the best later) algorithms in the infinite setting. This work permits the application of bandit models to a broader class of problems where fewer assumptions hold.}
}

Endnote

%0 Conference Paper
%T Pure Exploration in Infinitely-Armed Bandit Models with Fixed-Confidence
%A Maryam Aziz
%A Jesse Anderton
%A Emilie Kaufmann
%A Javed Aslam
%B Proceedings of Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2018
%E Firdaus Janoos
%E Mehryar Mohri
%E Karthik Sridharan	
%F pmlr-v83-aziz18a
%I PMLR
%P 3--24
%U https://proceedings.mlr.press/v83/aziz18a.html
%V 83
%X We consider the problem of near-optimal arm identification in the fixed confidence setting of the infinitely armed bandit problem when nothing is known about the arm reservoir distribution. We (1) introduce a PAC-like framework within which to derive and cast results; (2) derive a sample complexity lower bound for near-optimal arm identification; (3) propose an algorithm that identifies a nearly-optimal arm with high probability and derive an upper bound on its sample complexity which is within a log factor of our lower bound; and (4) discuss whether our $\log^2 \frac{1}{δ}$ dependence is inescapable for “two-phase” (select arms first, identify the best later) algorithms in the infinite setting. This work permits the application of bandit models to a broader class of problems where fewer assumptions hold.

APA


Aziz, M., Anderton, J., Kaufmann, E. & Aslam, J.. (2018). Pure Exploration in Infinitely-Armed Bandit Models with Fixed-Confidence. Proceedings of Algorithmic Learning Theory, in Proceedings of Machine Learning Research 83:3-24 Available from https://proceedings.mlr.press/v83/aziz18a.html.

Related Material

Download PDF