Near-Optimal $Φ$-Regret Learning in Extensive-Form Games

Ioannis Anagnostides; Gabriele Farina; Tuomas Sandholm

Proceedings of the 40th International Conference on Machine Learning, PMLR 202:814-839, 2023.

Abstract

In this paper, we establish efficient and uncoupled learning dynamics so that, when employed by all players in multiplayer perfect-recall imperfect-information extensive-form games, the trigger regret of each player grows as $O(\log T)$ after $T$ repetitions of play. This improves exponentially over the prior best known trigger-regret bound of $O(T^{1/4})$, and settles a recent open question by Bai et al. (2022). As an immediate consequence, we guarantee convergence to the set of extensive-form correlated equilibria and coarse correlated equilibria at a near-optimal rate of $\frac{\log T}{T}$. Building on prior work, at the heart of our construction lies a more general result regarding fixed points deriving from rational functions with polynomial degree, a property that we establish for the fixed points of (coarse) trigger deviation functions. Moreover, our construction leverages a refined regret circuit for the convex hull, which—unlike prior guarantees—preserves the RVU property introduced by Syrgkanis et al. (NIPS, 2015); this observation has an independent interest in establishing near-optimal regret under learning dynamics based on a CFR-type decomposition of the regret.
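The step from the $O(\log T)$ trigger-regret bound to the $\frac{\log T}{T}$ convergence rate can be sketched as follows. The symbols here are assumptions introduced only for illustration and do not appear in the abstract: $\mathrm{Reg}_i^T$ denotes the trigger regret of player $i$ after $T$ repetitions, $C$ is a hypothetical constant that may depend on the game but not on $T$, and $\epsilon_T$ is the equilibrium approximation error of the average correlated distribution of play.

```latex
% Assumed notation (not from the abstract): Reg_i^T, C, epsilon_T.
% Standard regret-to-equilibrium conversion: the average distribution of play
% is an epsilon_T-approximate equilibrium with epsilon_T bounded by the
% largest per-player (coarse) trigger regret divided by T.
\[
  \mathrm{Reg}_i^T \le C \log T \quad \text{for every player } i
  \qquad \Longrightarrow \qquad
  \epsilon_T \;\le\; \frac{\max_i \mathrm{Reg}_i^T}{T} \;\le\; \frac{C \log T}{T}.
\]
```

Under the same conversion, the prior $O(T^{1/4})$ bound would give a rate of order $T^{-3/4}$.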

Cite this Paper

BibTeX

@InProceedings{pmlr-v202-anagnostides23a,
  title = 	 {Near-Optimal $\Phi$-Regret Learning in Extensive-Form Games},
  author =       {Anagnostides, Ioannis and Farina, Gabriele and Sandholm, Tuomas},
  booktitle = 	 {Proceedings of the 40th International Conference on Machine Learning},
  pages = 	 {814--839},
  year = 	 {2023},
  editor = 	 {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = 	 {202},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {23--29 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v202/anagnostides23a/anagnostides23a.pdf},
  url = 	 {https://proceedings.mlr.press/v202/anagnostides23a.html},
  abstract = 	 {In this paper, we establish efficient and uncoupled learning dynamics so that, when employed by all players in multiplayer perfect-recall imperfect-information extensive-form games, the trigger regret of each player grows as $O(\log T)$ after $T$ repetitions of play. This improves exponentially over the prior best known trigger-regret bound of $O(T^{1/4})$, and settles a recent open question by Bai et al. (2022). As an immediate consequence, we guarantee convergence to the set of extensive-form correlated equilibria and coarse correlated equilibria at a near-optimal rate of $\frac{\log T}{T}$. Building on prior work, at the heart of our construction lies a more general result regarding fixed points deriving from rational functions with polynomial degree, a property that we establish for the fixed points of (coarse) trigger deviation functions. Moreover, our construction leverages a refined regret circuit for the convex hull, which—unlike prior guarantees—preserves the RVU property introduced by Syrgkanis et al. (NIPS, 2015); this observation has an independent interest in establishing near-optimal regret under learning dynamics based on a CFR-type decomposition of the regret.}
}

Endnote

%0 Conference Paper
%T Near-Optimal $Φ$-Regret Learning in Extensive-Form Games
%A Ioannis Anagnostides
%A Gabriele Farina
%A Tuomas Sandholm
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-anagnostides23a
%I PMLR
%P 814--839
%U https://proceedings.mlr.press/v202/anagnostides23a.html
%V 202
%X In this paper, we establish efficient and uncoupled learning dynamics so that, when employed by all players in multiplayer perfect-recall imperfect-information extensive-form games, the trigger regret of each player grows as $O(\log T)$ after $T$ repetitions of play. This improves exponentially over the prior best known trigger-regret bound of $O(T^{1/4})$, and settles a recent open question by Bai et al. (2022). As an immediate consequence, we guarantee convergence to the set of extensive-form correlated equilibria and coarse correlated equilibria at a near-optimal rate of $\frac{\log T}{T}$. Building on prior work, at the heart of our construction lies a more general result regarding fixed points deriving from rational functions with polynomial degree, a property that we establish for the fixed points of (coarse) trigger deviation functions. Moreover, our construction leverages a refined regret circuit for the convex hull, which—unlike prior guarantees—preserves the RVU property introduced by Syrgkanis et al. (NIPS, 2015); this observation has an independent interest in establishing near-optimal regret under learning dynamics based on a CFR-type decomposition of the regret.

APA

Anagnostides, I., Farina, G., & Sandholm, T. (2023). Near-Optimal $Φ$-Regret Learning in Extensive-Form Games. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:814-839. Available from https://proceedings.mlr.press/v202/anagnostides23a.html.