Online Learning and Blackwell Approachability with Partial Monitoring: Optimal Convergence Rates

Joon Kwon, Vianney Perchet
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, PMLR 54:604-613, 2017.

Abstract

Blackwell approachability is an online learning setup generalizing the classical problem of regret minimization by allowing, for instance, multi-criteria optimization, global (online) optimization of a convex loss, or online linear optimization under some cumulative constraint. We consider partial monitoring, where the decision maker does not necessarily observe the outcomes of his decisions (unlike in the traditional regret/bandit literature). Instead, he receives a random signal correlated with the decision–outcome pair, or only with the outcome. We construct, for the first time, approachability algorithms with convergence rate of order $O(T^{-1/2})$ when the signal is independent of the decision, and of order $O(T^{-1/3})$ in the case of general signals. These rates are optimal in the sense that they cannot be improved without further assumptions on the structure of the objectives and/or the signals.
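
For concreteness, the objective can be stated in the standard Blackwell approachability notation (a sketch only; the symbols $x_t$, $y_t$, $r$, and $\mathcal{C}$ are generic and not necessarily the paper's own): at each stage $t$ the decision maker selects an action $x_t$, Nature selects an outcome $y_t$, and a vector-valued payoff $r(x_t, y_t) \in \mathbb{R}^d$ is produced. The goal is to drive the average payoff toward a closed convex target set $\mathcal{C}$, i.e.

$$d\left(\frac{1}{T}\sum_{t=1}^{T} r(x_t, y_t),\ \mathcal{C}\right) \xrightarrow[T \to \infty]{} 0,$$

where $d(\cdot, \mathcal{C})$ denotes the Euclidean distance to $\mathcal{C}$. The rates in the abstract bound this distance after $T$ stages: $O(T^{-1/2})$ when the signal depends only on the outcome, and $O(T^{-1/3})$ for general signals that may depend on the decision–outcome pair.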

Cite this Paper


BibTeX
@InProceedings{pmlr-v54-kwon17a,
  title     = {{Online Learning and Blackwell Approachability with Partial Monitoring: Optimal Convergence Rates}},
  author    = {Kwon, Joon and Perchet, Vianney},
  booktitle = {Proceedings of the 20th International Conference on Artificial Intelligence and Statistics},
  pages     = {604--613},
  year      = {2017},
  editor    = {Singh, Aarti and Zhu, Jerry},
  volume    = {54},
  series    = {Proceedings of Machine Learning Research},
  month     = {20--22 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v54/kwon17a/kwon17a.pdf},
  url       = {https://proceedings.mlr.press/v54/kwon17a.html},
  abstract  = {Blackwell approachability is an online learning setup generalizing the classical problem of regret minimization by allowing for instance multi-criteria optimization, global (online) optimization of a convex loss, or online linear optimization under some cumulative constraint. We consider partial monitoring where the decision maker does not necessarily observe the outcomes of his decision (unlike the traditional regret/bandit literature). Instead, he receives a random signal correlated to the decision–outcome pair, or only to the outcome. We construct, for the first time, approachability algorithms with convergence rate of order $O(T^{-1/2})$ when the signal is independent of the decision and of order $O(T^{-1/3})$ in the case of general signals. Those rates are optimal in the sense that they cannot be improved without further assumption on the structure of the objectives and/or the signals.}
}
Endnote
%0 Conference Paper
%T Online Learning and Blackwell Approachability with Partial Monitoring: Optimal Convergence Rates
%A Joon Kwon
%A Vianney Perchet
%B Proceedings of the 20th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2017
%E Aarti Singh
%E Jerry Zhu
%F pmlr-v54-kwon17a
%I PMLR
%P 604--613
%U https://proceedings.mlr.press/v54/kwon17a.html
%V 54
%X Blackwell approachability is an online learning setup generalizing the classical problem of regret minimization by allowing for instance multi-criteria optimization, global (online) optimization of a convex loss, or online linear optimization under some cumulative constraint. We consider partial monitoring where the decision maker does not necessarily observe the outcomes of his decision (unlike the traditional regret/bandit literature). Instead, he receives a random signal correlated to the decision–outcome pair, or only to the outcome. We construct, for the first time, approachability algorithms with convergence rate of order $O(T^{-1/2})$ when the signal is independent of the decision and of order $O(T^{-1/3})$ in the case of general signals. Those rates are optimal in the sense that they cannot be improved without further assumption on the structure of the objectives and/or the signals.
APA
Kwon, J. & Perchet, V. (2017). Online Learning and Blackwell Approachability with Partial Monitoring: Optimal Convergence Rates. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 54:604-613. Available from https://proceedings.mlr.press/v54/kwon17a.html.
