Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers

Luke Marris; Paul Muller; Marc Lanctot; Karl Tuyls; Thore Graepel

Proceedings of the 38th International Conference on Machine Learning, PMLR 139:7480-7491, 2021.

Abstract

Two-player, constant-sum games are well studied in the literature, but there has been limited progress outside of this setting. We propose Joint Policy-Space Response Oracles (JPSRO), an algorithm for training agents in n-player, general-sum extensive form games, which provably converges to an equilibrium. We further suggest correlated equilibria (CE) as promising meta-solvers, and propose a novel solution concept Maximum Gini Correlated Equilibrium (MGCE), a principled and computationally efficient family of solutions for solving the correlated equilibrium selection problem. We conduct several experiments using CE meta-solvers for JPSRO and demonstrate convergence on n-player, general-sum games.

Cite this Paper

BibTeX

@InProceedings{pmlr-v139-marris21a,
  title     = {Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers},
  author    = {Marris, Luke and Muller, Paul and Lanctot, Marc and Tuyls, Karl and Graepel, Thore},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {7480--7491},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/marris21a/marris21a.pdf},
  url       = {https://proceedings.mlr.press/v139/marris21a.html},
  abstract  = {Two-player, constant-sum games are well studied in the literature, but there has been limited progress outside of this setting. We propose Joint Policy-Space Response Oracles (JPSRO), an algorithm for training agents in n-player, general-sum extensive form games, which provably converges to an equilibrium. We further suggest correlated equilibria (CE) as promising meta-solvers, and propose a novel solution concept Maximum Gini Correlated Equilibrium (MGCE), a principled and computationally efficient family of solutions for solving the correlated equilibrium selection problem. We conduct several experiments using CE meta-solvers for JPSRO and demonstrate convergence on n-player, general-sum games.}
}

Endnote

%0 Conference Paper
%T Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers
%A Luke Marris
%A Paul Muller
%A Marc Lanctot
%A Karl Tuyls
%A Thore Graepel
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-marris21a
%I PMLR
%P 7480--7491
%U https://proceedings.mlr.press/v139/marris21a.html
%V 139
%X Two-player, constant-sum games are well studied in the literature, but there has been limited progress outside of this setting. We propose Joint Policy-Space Response Oracles (JPSRO), an algorithm for training agents in n-player, general-sum extensive form games, which provably converges to an equilibrium. We further suggest correlated equilibria (CE) as promising meta-solvers, and propose a novel solution concept Maximum Gini Correlated Equilibrium (MGCE), a principled and computationally efficient family of solutions for solving the correlated equilibrium selection problem. We conduct several experiments using CE meta-solvers for JPSRO and demonstrate convergence on n-player, general-sum games.

APA

Marris, L., Muller, P., Lanctot, M., Tuyls, K. & Graepel, T. (2021). Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:7480-7491. Available from https://proceedings.mlr.press/v139/marris21a.html.
