Achieving Optimal Dynamic Regret for Non-stationary Bandits without Prior Information

Peter Auer; Yifang Chen; Pratik Gajane; Chung-Wei Lee; Haipeng Luo; Ronald Ortner; Chen-Yu Wei

Proceedings of the Thirty-Second Conference on Learning Theory, PMLR 99:159-163, 2019.

Abstract

This joint extended abstract introduces and compares the results of (Auer et al., 2019) and (Chen et al., 2019), both of which resolve the problem of achieving optimal dynamic regret for non-stationary bandits without prior information on the non-stationarity. Specifically, Auer et al. (2019) resolve the problem for the traditional multi-armed bandits setting, while Chen et al. (2019) give a solution for the more general contextual bandits setting. Both works extend the key idea of (Auer et al., 2018) developed for a simpler two-armed setting.

Cite this Paper

BibTeX


@InProceedings{pmlr-v99-auer19b,
  title = 	 {Achieving Optimal Dynamic Regret for Non-stationary Bandits without Prior Information},
  author =       {Auer, Peter and Chen, Yifang and Gajane, Pratik and Lee, Chung-Wei and Luo, Haipeng and Ortner, Ronald and Wei, Chen-Yu},
  booktitle = 	 {Proceedings of the Thirty-Second Conference on Learning Theory},
  pages = 	 {159--163},
  year = 	 {2019},
  editor = 	 {Beygelzimer, Alina and Hsu, Daniel},
  volume = 	 {99},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {25--28 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v99/auer19b/auer19b.pdf},
  url = 	 {https://proceedings.mlr.press/v99/auer19b.html},
  abstract = 	 {This joint extended abstract introduces and compares the results of (Auer et al., 2019) and (Chen et al., 2019), both of which resolve the problem of achieving optimal dynamic regret for non-stationary bandits without prior information on the non-stationarity. Specifically, Auer et al. (2019) resolve the problem for the traditional multi-armed bandits setting, while Chen et al. (2019) give a solution for the more general contextual bandits setting. Both works extend the key idea of (Auer et al., 2018) developed for a simpler two-armed setting.}
}

Endnote

%0 Conference Paper
%T Achieving Optimal Dynamic Regret for Non-stationary Bandits without Prior Information
%A Peter Auer
%A Yifang Chen
%A Pratik Gajane
%A Chung-Wei Lee
%A Haipeng Luo
%A Ronald Ortner
%A Chen-Yu Wei
%B Proceedings of the Thirty-Second Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2019
%E Alina Beygelzimer
%E Daniel Hsu
%F pmlr-v99-auer19b
%I PMLR
%P 159--163
%U https://proceedings.mlr.press/v99/auer19b.html
%V 99
%X This joint extended abstract introduces and compares the results of (Auer et al., 2019) and (Chen et al., 2019), both of which resolve the problem of achieving optimal dynamic regret for non-stationary bandits without prior information on the non-stationarity. Specifically, Auer et al. (2019) resolve the problem for the traditional multi-armed bandits setting, while Chen et al. (2019) give a solution for the more general contextual bandits setting. Both works extend the key idea of (Auer et al., 2018) developed for a simpler two-armed setting.

APA


Auer, P., Chen, Y., Gajane, P., Lee, C., Luo, H., Ortner, R. & Wei, C. (2019). Achieving Optimal Dynamic Regret for Non-stationary Bandits without Prior Information. Proceedings of the Thirty-Second Conference on Learning Theory, in Proceedings of Machine Learning Research 99:159-163. Available from https://proceedings.mlr.press/v99/auer19b.html.
