Cleaning up the neighborhood: A full classification for adversarial partial monitoring

Tor Lattimore, Csaba Szepesvári
; Proceedings of the 30th International Conference on Algorithmic Learning Theory, PMLR 98:529-556, 2019.

Abstract

Partial monitoring is a generalization of the well-known multi-armed bandit framework where the loss is not directly observed by the learner. We complete the classification of finite adversarial partial monitoring to include all games, solving an open problem posed by Bartok et al. (2014). Along the way we simplify and improve existing algorithms and correct errors in previous analyses. Our second contribution is a new algorithm for the class of games studied by Bartok (2013) where we prove upper and lower regret bounds that shed more light on the dependence of the regret on the game structure.

Cite this Paper


BibTeX
@InProceedings{pmlr-v98-lattimore19a, title = {Cleaning up the neighborhood: A full classification for adversarial partial monitoring}, author = {Lattimore, Tor and Szepesv{\'a}ri, Csaba}, booktitle = {Proceedings of the 30th International Conference on Algorithmic Learning Theory}, pages = {529--556}, year = {2019}, editor = {Aurélien Garivier and Satyen Kale}, volume = {98}, series = {Proceedings of Machine Learning Research}, address = {Chicago, Illinois}, month = {22--24 Mar}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v98/lattimore19a/lattimore19a.pdf}, url = {http://proceedings.mlr.press/v98/lattimore19a.html}, abstract = {Partial monitoring is a generalization of the well-known multi-armed bandit framework where the loss is not directly observed by the learner. We complete the classification of finite adversarial partial monitoring to include all games, solving an open problem posed by Bartok et al. (2014). Along the way we simplify and improve existing algorithms and correct errors in previous analyses. Our second contribution is a new algorithm for the class of games studied by Bartok (2013) where we prove upper and lower regret bounds that shed more light on the dependence of the regret on the game structure.} }
Endnote
%0 Conference Paper %T Cleaning up the neighborhood: A full classification for adversarial partial monitoring %A Tor Lattimore %A Csaba Szepesvári %B Proceedings of the 30th International Conference on Algorithmic Learning Theory %C Proceedings of Machine Learning Research %D 2019 %E Aurélien Garivier %E Satyen Kale %F pmlr-v98-lattimore19a %I PMLR %J Proceedings of Machine Learning Research %P 529--556 %U http://proceedings.mlr.press %V 98 %W PMLR %X Partial monitoring is a generalization of the well-known multi-armed bandit framework where the loss is not directly observed by the learner. We complete the classification of finite adversarial partial monitoring to include all games, solving an open problem posed by Bartok et al. (2014). Along the way we simplify and improve existing algorithms and correct errors in previous analyses. Our second contribution is a new algorithm for the class of games studied by Bartok (2013) where we prove upper and lower regret bounds that shed more light on the dependence of the regret on the game structure.
APA
Lattimore, T. & Szepesvári, C.. (2019). Cleaning up the neighborhood: A full classification for adversarial partial monitoring. Proceedings of the 30th International Conference on Algorithmic Learning Theory, in PMLR 98:529-556

Related Material