Let’s be Honest: An Optimal No-Regret Framework for Zero-Sum Games

Ehsan Asadi Kangarshahi, Ya-Ping Hsieh, Mehmet Fatih Sahin, Volkan Cevher
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:2488-2496, 2018.

Abstract

We revisit the problem of solving two-player zero-sum games in the decentralized setting. We propose a simple algorithmic framework that simultaneously achieves the best rates for honest regret as well as adversarial regret, and in addition resolves the open problem of removing the logarithmic terms in convergence to the value of the game. We achieve this goal in three steps. First, we provide a novel analysis of the optimistic mirror descent (OMD), showing that it can be modified to guarantee fast convergence for both honest regret and value of the game, when the players are playing collaboratively. Second, we propose a new algorithm, dubbed as robust optimistic mirror descent (ROMD), which attains optimal adversarial regret without knowing the time horizon beforehand. Finally, we propose a simple signaling scheme, which enables us to bridge OMD and ROMD to achieve the best of both worlds. Numerical examples are presented to support our theoretical claims and show that our non-adaptive ROMD algorithm can be competitive to OMD with adaptive step-size selection.

Cite this Paper


BibTeX
@InProceedings{pmlr-v80-kangarshahi18a, title = {Let’s be Honest: An Optimal No-Regret Framework for Zero-Sum Games}, author = {Kangarshahi, Ehsan Asadi and Hsieh, Ya-Ping and Sahin, Mehmet Fatih and Cevher, Volkan}, booktitle = {Proceedings of the 35th International Conference on Machine Learning}, pages = {2488--2496}, year = {2018}, editor = {Dy, Jennifer and Krause, Andreas}, volume = {80}, series = {Proceedings of Machine Learning Research}, month = {10--15 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v80/kangarshahi18a/kangarshahi18a.pdf}, url = {https://proceedings.mlr.press/v80/kangarshahi18a.html}, abstract = {We revisit the problem of solving two-player zero-sum games in the decentralized setting. We propose a simple algorithmic framework that simultaneously achieves the best rates for honest regret as well as adversarial regret, and in addition resolves the open problem of removing the logarithmic terms in convergence to the value of the game. We achieve this goal in three steps. First, we provide a novel analysis of the optimistic mirror descent (OMD), showing that it can be modified to guarantee fast convergence for both honest regret and value of the game, when the players are playing collaboratively. Second, we propose a new algorithm, dubbed as robust optimistic mirror descent (ROMD), which attains optimal adversarial regret without knowing the time horizon beforehand. Finally, we propose a simple signaling scheme, which enables us to bridge OMD and ROMD to achieve the best of both worlds. Numerical examples are presented to support our theoretical claims and show that our non-adaptive ROMD algorithm can be competitive to OMD with adaptive step-size selection.} }
Endnote
%0 Conference Paper %T Let’s be Honest: An Optimal No-Regret Framework for Zero-Sum Games %A Ehsan Asadi Kangarshahi %A Ya-Ping Hsieh %A Mehmet Fatih Sahin %A Volkan Cevher %B Proceedings of the 35th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2018 %E Jennifer Dy %E Andreas Krause %F pmlr-v80-kangarshahi18a %I PMLR %P 2488--2496 %U https://proceedings.mlr.press/v80/kangarshahi18a.html %V 80 %X We revisit the problem of solving two-player zero-sum games in the decentralized setting. We propose a simple algorithmic framework that simultaneously achieves the best rates for honest regret as well as adversarial regret, and in addition resolves the open problem of removing the logarithmic terms in convergence to the value of the game. We achieve this goal in three steps. First, we provide a novel analysis of the optimistic mirror descent (OMD), showing that it can be modified to guarantee fast convergence for both honest regret and value of the game, when the players are playing collaboratively. Second, we propose a new algorithm, dubbed as robust optimistic mirror descent (ROMD), which attains optimal adversarial regret without knowing the time horizon beforehand. Finally, we propose a simple signaling scheme, which enables us to bridge OMD and ROMD to achieve the best of both worlds. Numerical examples are presented to support our theoretical claims and show that our non-adaptive ROMD algorithm can be competitive to OMD with adaptive step-size selection.
APA
Kangarshahi, E.A., Hsieh, Y., Sahin, M.F. & Cevher, V.. (2018). Let’s be Honest: An Optimal No-Regret Framework for Zero-Sum Games. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:2488-2496 Available from https://proceedings.mlr.press/v80/kangarshahi18a.html.

Related Material