Abstracting Imperfect Information Away from Two-Player Zero-Sum Games

Samuel Sokota, Ryan D’Orazio, Chun Kai Ling, David J Wu, J Zico Kolter, Noam Brown
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:32169-32193, 2023.

Abstract

In their seminal work, Nayyar et al. (2013) showed that imperfect information can be abstracted away from common-payoff games by having players publicly announce their policies as they play. This insight underpins sound solvers and decision-time planning algorithms for common-payoff games. Unfortunately, a naive application of the same insight to two-player zero-sum games fails because Nash equilibria of the game with public policy announcements may not correspond to Nash equilibria of the original game. As a consequence, existing sound decision-time planning algorithms require complicated additional mechanisms that have unappealing properties. The main contribution of this work is showing that certain regularized equilibria do not possess the aforementioned non-correspondence problem—thus, computing them can be treated as perfect-information problems. Because these regularized equilibria can be made arbitrarily close to Nash equilibria, our result opens the door to a new perspective to solving two-player zero-sum games and yields a simplified framework for decision-time planning in two-player zero-sum games, void of the unappealing properties that plague existing decision-time planning approaches.

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-sokota23a, title = {Abstracting Imperfect Information Away from Two-Player Zero-Sum Games}, author = {Sokota, Samuel and D'Orazio, Ryan and Ling, Chun Kai and Wu, David J and Kolter, J Zico and Brown, Noam}, booktitle = {Proceedings of the 40th International Conference on Machine Learning}, pages = {32169--32193}, year = {2023}, editor = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan}, volume = {202}, series = {Proceedings of Machine Learning Research}, month = {23--29 Jul}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v202/sokota23a/sokota23a.pdf}, url = {https://proceedings.mlr.press/v202/sokota23a.html}, abstract = {In their seminal work, Nayyar et al. (2013) showed that imperfect information can be abstracted away from common-payoff games by having players publicly announce their policies as they play. This insight underpins sound solvers and decision-time planning algorithms for common-payoff games. Unfortunately, a naive application of the same insight to two-player zero-sum games fails because Nash equilibria of the game with public policy announcements may not correspond to Nash equilibria of the original game. As a consequence, existing sound decision-time planning algorithms require complicated additional mechanisms that have unappealing properties. The main contribution of this work is showing that certain regularized equilibria do not possess the aforementioned non-correspondence problem—thus, computing them can be treated as perfect-information problems. Because these regularized equilibria can be made arbitrarily close to Nash equilibria, our result opens the door to a new perspective to solving two-player zero-sum games and yields a simplified framework for decision-time planning in two-player zero-sum games, void of the unappealing properties that plague existing decision-time planning approaches.} }
Endnote
%0 Conference Paper %T Abstracting Imperfect Information Away from Two-Player Zero-Sum Games %A Samuel Sokota %A Ryan D’Orazio %A Chun Kai Ling %A David J Wu %A J Zico Kolter %A Noam Brown %B Proceedings of the 40th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2023 %E Andreas Krause %E Emma Brunskill %E Kyunghyun Cho %E Barbara Engelhardt %E Sivan Sabato %E Jonathan Scarlett %F pmlr-v202-sokota23a %I PMLR %P 32169--32193 %U https://proceedings.mlr.press/v202/sokota23a.html %V 202 %X In their seminal work, Nayyar et al. (2013) showed that imperfect information can be abstracted away from common-payoff games by having players publicly announce their policies as they play. This insight underpins sound solvers and decision-time planning algorithms for common-payoff games. Unfortunately, a naive application of the same insight to two-player zero-sum games fails because Nash equilibria of the game with public policy announcements may not correspond to Nash equilibria of the original game. As a consequence, existing sound decision-time planning algorithms require complicated additional mechanisms that have unappealing properties. The main contribution of this work is showing that certain regularized equilibria do not possess the aforementioned non-correspondence problem—thus, computing them can be treated as perfect-information problems. Because these regularized equilibria can be made arbitrarily close to Nash equilibria, our result opens the door to a new perspective to solving two-player zero-sum games and yields a simplified framework for decision-time planning in two-player zero-sum games, void of the unappealing properties that plague existing decision-time planning approaches.
APA
Sokota, S., D’Orazio, R., Ling, C.K., Wu, D.J., Kolter, J.Z. & Brown, N.. (2023). Abstracting Imperfect Information Away from Two-Player Zero-Sum Games. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:32169-32193 Available from https://proceedings.mlr.press/v202/sokota23a.html.

Related Material