Online-to-Confidence-Set Conversions and Application to Sparse Stochastic Bandits

Yasin Abbasi-Yadkori; David Pal; Csaba Szepesvari

Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, PMLR 22:1-9, 2012.

Abstract

We introduce a novel technique, which we call online-to-confidence-set conversion. The technique allows us to construct high-probability confidence sets for linear prediction with correlated inputs given the predictions of any algorithm (e.g., online LASSO, exponentiated gradient algorithm, online least-squares, p-norm algorithm) targeting online learning with linear predictors and the quadratic loss. By construction, the size of the confidence set is directly governed by the regret of the online learning algorithm. Constructing tight confidence sets is interesting on its own, but the new technique is given extra weight by the fact that having access to tight confidence sets underlies a number of important problems. The advantage of our construction here is that progress in constructing better algorithms for online prediction problems directly translates into tighter confidence sets. In this paper, this is demonstrated in the case of linear stochastic bandits. In particular, we introduce the sparse variant of linear stochastic bandits and show that a recent online algorithm together with our online-to-confidence-set conversion allows one to derive algorithms that can exploit the case where the reward is a function of a sparse linear combination of the components of the chosen action.

Cite this Paper

BibTeX


@InProceedings{pmlr-v22-abbasi-yadkori12,
  title = 	 {Online-to-Confidence-Set Conversions and Application to Sparse Stochastic Bandits},
  author = 	 {Abbasi-Yadkori, Yasin and Pal, David and Szepesvari, Csaba},
  booktitle = 	 {Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics},
  pages = 	 {1--9},
  year = 	 {2012},
  editor = 	 {Lawrence, Neil D. and Girolami, Mark},
  volume = 	 {22},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {La Palma, Canary Islands},
  month = 	 {21--23 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v22/abbasi-yadkori12/abbasi-yadkori12.pdf},
  url = 	 {https://proceedings.mlr.press/v22/abbasi-yadkori12.html},
  abstract = 	 {We introduce a novel technique, which we call online-to-confidence-set conversion. The technique allows us to construct high-probability confidence sets for linear prediction with correlated inputs given the predictions of any algorithm (e.g., online LASSO, exponentiated gradient algorithm, online least-squares, p-norm algorithm) targeting online learning with linear predictors and the quadratic loss. By construction, the size of the confidence set is directly governed by the regret of the online learning algorithm. Constructing tight confidence sets is interesting on its own, but the new technique is given extra weight by the fact that having access to tight confidence sets underlies a number of important problems. The advantage of our construction here is that progress in constructing better algorithms for online prediction problems directly translates into tighter confidence sets. In this paper, this is demonstrated in the case of linear stochastic bandits. In particular, we introduce the sparse variant of linear stochastic bandits and show that a recent online algorithm together with our online-to-confidence-set conversion allows one to derive algorithms that can exploit the case where the reward is a function of a sparse linear combination of the components of the chosen action.}
}

Endnote

%0 Conference Paper
%T Online-to-Confidence-Set Conversions and Application to Sparse Stochastic Bandits
%A Yasin Abbasi-Yadkori
%A David Pal
%A Csaba Szepesvari
%B Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2012
%E Neil D. Lawrence
%E Mark Girolami
%F pmlr-v22-abbasi-yadkori12
%I PMLR
%P 1--9
%U https://proceedings.mlr.press/v22/abbasi-yadkori12.html
%V 22
%X We introduce a novel technique, which we call online-to-confidence-set conversion. The technique allows us to construct high-probability confidence sets for linear prediction with correlated inputs given the predictions of any algorithm (e.g., online LASSO, exponentiated gradient algorithm, online least-squares, p-norm algorithm) targeting online learning with linear predictors and the quadratic loss. By construction, the size of the confidence set is directly governed by the regret of the online learning algorithm. Constructing tight confidence sets is interesting on its own, but the new technique is given extra weight by the fact that having access to tight confidence sets underlies a number of important problems. The advantage of our construction here is that progress in constructing better algorithms for online prediction problems directly translates into tighter confidence sets. In this paper, this is demonstrated in the case of linear stochastic bandits. In particular, we introduce the sparse variant of linear stochastic bandits and show that a recent online algorithm together with our online-to-confidence-set conversion allows one to derive algorithms that can exploit the case where the reward is a function of a sparse linear combination of the components of the chosen action.

RIS


TY  - CPAPER
TI  - Online-to-Confidence-Set Conversions and Application to Sparse Stochastic Bandits
AU  - Yasin Abbasi-Yadkori
AU  - David Pal
AU  - Csaba Szepesvari
BT  - Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
DA  - 2012/03/21
ED  - Neil D. Lawrence
ED  - Mark Girolami
ID  - pmlr-v22-abbasi-yadkori12
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 22
SP  - 1
EP  - 9
L1  - http://proceedings.mlr.press/v22/abbasi-yadkori12/abbasi-yadkori12.pdf
UR  - https://proceedings.mlr.press/v22/abbasi-yadkori12.html
AB  - We introduce a novel technique, which we call online-to-confidence-set conversion. The technique allows us to construct high-probability confidence sets for linear prediction with correlated inputs given the predictions of any algorithm (e.g., online LASSO, exponentiated gradient algorithm, online least-squares, p-norm algorithm) targeting online learning with linear predictors and the quadratic loss. By construction, the size of the confidence set is directly governed by the regret of the online learning algorithm. Constructing tight confidence sets is interesting on its own, but the new technique is given extra weight by the fact that having access to tight confidence sets underlies a number of important problems. The advantage of our construction here is that progress in constructing better algorithms for online prediction problems directly translates into tighter confidence sets. In this paper, this is demonstrated in the case of linear stochastic bandits. In particular, we introduce the sparse variant of linear stochastic bandits and show that a recent online algorithm together with our online-to-confidence-set conversion allows one to derive algorithms that can exploit the case where the reward is a function of a sparse linear combination of the components of the chosen action.
ER  -

APA


Abbasi-Yadkori, Y., Pal, D. & Szepesvari, C. (2012). Online-to-Confidence-Set Conversions and Application to Sparse Stochastic Bandits. Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 22:1-9. Available from https://proceedings.mlr.press/v22/abbasi-yadkori12.html.
