Evaluation and Analysis of the Performance of the EXP3 Algorithm in Stochastic Environments

Yevgeny Seldin; Csaba Szepesvári; Peter Auer; Yasin Abbasi-Yadkori

Proceedings of the Tenth European Workshop on Reinforcement Learning, PMLR 24:103-116, 2013.

Abstract

EXP3 is a popular algorithm for adversarial multiarmed bandits, suggested and analyzed in this setting by Auer et al. [2002b]. Recently there has been increased interest in the performance of this algorithm in the stochastic setting, owing to its new applications to stochastic multiarmed bandits with side information [Seldin et al., 2011] and to multiarmed bandits in the mixed stochastic-adversarial setting [Bubeck and Slivkins, 2012]. We present an empirical evaluation and improved analysis of the performance of the EXP3 algorithm in the stochastic setting, as well as a modification of the EXP3 algorithm capable of achieving “logarithmic” regret in stochastic environments.

Cite this Paper

BibTeX


@InProceedings{pmlr-v24-seldin12a,
  title = 	 {Evaluation and Analysis of the Performance of the EXP3 Algorithm in Stochastic Environments},
  author = 	 {Seldin, Yevgeny and Szepesvári, Csaba and Auer, Peter and Abbasi-Yadkori, Yasin},
  booktitle = 	 {Proceedings of the Tenth European Workshop on Reinforcement Learning},
  pages = 	 {103--116},
  year = 	 {2013},
  editor = 	 {Deisenroth, Marc Peter and Szepesvári, Csaba and Peters, Jan},
  volume = 	 {24},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Edinburgh, Scotland},
  month = 	 {30 Jun--01 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v24/seldin12a/seldin12a.pdf},
  url = 	 {https://proceedings.mlr.press/v24/seldin12a.html},
  abstract = 	 {EXP3 is a popular algorithm for adversarial multiarmed bandits, suggested and analyzed in this setting by Auer et al. [2002b]. Recently there was an increased interest in the performance of this algorithm in the stochastic setting, due to its new applications to stochastic multiarmed bandits with side information [Seldin et al., 2011] and to multiarmed bandits in the mixed stochastic-adversarial setting [Bubeck and Slivkins, 2012]. We present an empirical evaluation and improved analysis of the performance of the EXP3 algorithm in the stochastic setting, as well as a modification of the EXP3 algorithm capable of achieving “logarithmic” regret in stochastic environments.}
}

Endnote

%0 Conference Paper
%T Evaluation and Analysis of the Performance of the EXP3 Algorithm in Stochastic Environments
%A Yevgeny Seldin
%A Csaba Szepesvári
%A Peter Auer
%A Yasin Abbasi-Yadkori
%B Proceedings of the Tenth European Workshop on Reinforcement Learning
%C Proceedings of Machine Learning Research
%D 2013
%E Marc Peter Deisenroth
%E Csaba Szepesvári
%E Jan Peters
%F pmlr-v24-seldin12a
%I PMLR
%P 103--116
%U https://proceedings.mlr.press/v24/seldin12a.html
%V 24
%X EXP3 is a popular algorithm for adversarial multiarmed bandits, suggested and analyzed in this setting by Auer et al. [2002b]. Recently there was an increased interest in the performance of this algorithm in the stochastic setting, due to its new applications to stochastic multiarmed bandits with side information [Seldin et al., 2011] and to multiarmed bandits in the mixed stochastic-adversarial setting [Bubeck and Slivkins, 2012]. We present an empirical evaluation and improved analysis of the performance of the EXP3 algorithm in the stochastic setting, as well as a modification of the EXP3 algorithm capable of achieving “logarithmic” regret in stochastic environments.

RIS


TY  - CPAPER
TI  - Evaluation and Analysis of the Performance of the EXP3 Algorithm in Stochastic Environments
AU  - Yevgeny Seldin
AU  - Csaba Szepesvári
AU  - Peter Auer
AU  - Yasin Abbasi-Yadkori
BT  - Proceedings of the Tenth European Workshop on Reinforcement Learning
DA  - 2013/01/12
ED  - Marc Peter Deisenroth
ED  - Csaba Szepesvári
ED  - Jan Peters
ID  - pmlr-v24-seldin12a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 24
SP  - 103
EP  - 116
L1  - http://proceedings.mlr.press/v24/seldin12a/seldin12a.pdf
UR  - https://proceedings.mlr.press/v24/seldin12a.html
AB  - EXP3 is a popular algorithm for adversarial multiarmed bandits, suggested and analyzed in this setting by Auer et al. [2002b]. Recently there was an increased interest in the performance of this algorithm in the stochastic setting, due to its new applications to stochastic multiarmed bandits with side information [Seldin et al., 2011] and to multiarmed bandits in the mixed stochastic-adversarial setting [Bubeck and Slivkins, 2012]. We present an empirical evaluation and improved analysis of the performance of the EXP3 algorithm in the stochastic setting, as well as a modification of the EXP3 algorithm capable of achieving “logarithmic” regret in stochastic environments.
ER  -

APA


Seldin, Y., Szepesvári, C., Auer, P. & Abbasi-Yadkori, Y. (2013). Evaluation and Analysis of the Performance of the EXP3 Algorithm in Stochastic Environments. Proceedings of the Tenth European Workshop on Reinforcement Learning, in Proceedings of Machine Learning Research 24:103-116. Available from https://proceedings.mlr.press/v24/seldin12a.html.
