Large-Scale Bandit Problems and KWIK Learning

Jacob Abernethy; Kareem Amin; Michael Kearns; Moez Draief

Large-Scale Bandit Problems and KWIK Learning

Jacob Abernethy, Kareem Amin, Michael Kearns, Moez Draief

Proceedings of the 30th International Conference on Machine Learning, PMLR 28(1):588-596, 2013.

Abstract

We show that parametric multi-armed bandit (MAB) problems with large state and action spaces can be algorithmically reduced to the supervised learning model known as Knows What It Knows or KWIK learning. We give matching impossibility results showing that the KWIK learnability requirement cannot be replaced by weaker supervised learning assumptions. We provide such results in both the standard parametric MAB setting, as well as for a new model in which the action space is finite but growing with time.

Cite this Paper

BibTeX


@InProceedings{pmlr-v28-abernethy13,
  title = 	 {Large-Scale Bandit Problems and {KWIK} Learning},
  author = 	 {Abernethy, Jacob and Amin, Kareem and Kearns, Michael and Draief, Moez},
  booktitle = 	 {Proceedings of the 30th International Conference on Machine Learning},
  pages = 	 {588--596},
  year = 	 {2013},
  editor = 	 {Dasgupta, Sanjoy and McAllester, David},
  volume = 	 {28},
  number =       {1},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Atlanta, Georgia, USA},
  month = 	 {17--19 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v28/abernethy13.pdf},
  url = 	 {https://proceedings.mlr.press/v28/abernethy13.html},
  abstract = 	 {We show that parametric multi-armed bandit (MAB) problems with large state and action spaces can be algorithmically reduced to the supervised learning model known as Knows What It Knows or KWIK learning. We give matching impossibility results showing that the KWIK learnability requirement cannot be replaced by weaker supervised learning assumptions. We provide such results in both the standard parametric MAB setting, as well as for a new model in which the action space is finite but growing with time.}
}

Endnote

%0 Conference Paper
%T Large-Scale Bandit Problems and KWIK Learning
%A Jacob Abernethy
%A Kareem Amin
%A Michael Kearns
%A Moez Draief
%B Proceedings of the 30th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2013
%E Sanjoy Dasgupta
%E David McAllester	
%F pmlr-v28-abernethy13
%I PMLR
%P 588--596
%U https://proceedings.mlr.press/v28/abernethy13.html
%V 28
%N 1
%X We show that parametric multi-armed bandit (MAB) problems with large state and action spaces can be algorithmically reduced to the supervised learning model known as Knows What It Knows or KWIK learning. We give matching impossibility results showing that the KWIK learnability requirement cannot be replaced by weaker supervised learning assumptions. We provide such results in both the standard parametric MAB setting, as well as for a new model in which the action space is finite but growing with time.

RIS


TY  - CPAPER
TI  - Large-Scale Bandit Problems and KWIK Learning
AU  - Jacob Abernethy
AU  - Kareem Amin
AU  - Michael Kearns
AU  - Moez Draief
BT  - Proceedings of the 30th International Conference on Machine Learning
DA  - 2013/02/13
ED  - Sanjoy Dasgupta
ED  - David McAllester	
ID  - pmlr-v28-abernethy13
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 28
IS  - 1
SP  - 588
EP  - 596
L1  - http://proceedings.mlr.press/v28/abernethy13.pdf
UR  - https://proceedings.mlr.press/v28/abernethy13.html
AB  - We show that parametric multi-armed bandit (MAB) problems with large state and action spaces can be algorithmically reduced to the supervised learning model known as Knows What It Knows or KWIK learning. We give matching impossibility results showing that the KWIK learnability requirement cannot be replaced by weaker supervised learning assumptions. We provide such results in both the standard parametric MAB setting, as well as for a new model in which the action space is finite but growing with time.
ER  -

APA


Abernethy, J., Amin, K., Kearns, M. & Draief, M.. (2013). Large-Scale Bandit Problems and KWIK Learning. Proceedings of the 30th International Conference on Machine Learning, in Proceedings of Machine Learning Research 28(1):588-596 Available from https://proceedings.mlr.press/v28/abernethy13.html.

Large-Scale Bandit Problems and KWIK Learning

Abstract

Cite this Paper

Related Material