Near-Optimal Policy Optimization for Correlated Equilibrium in General-Sum Markov Games

Yang Cai, Haipeng Luo, Chen-Yu Wei, Weiqiang Zheng
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:3889-3897, 2024.

Abstract

We study policy optimization algorithms for computing correlated equilibria in multi-player general-sum Markov Games. Previous results achieve $\tilde{O}(T^{-1/2})$ convergence rate to a correlated equilibrium and an accelerated $\tilde{O}(T^{-3/4})$ convergence rate to the weaker notion of coarse correlated equilibrium. In this paper, we improve both results significantly by providing an uncoupled policy optimization algorithm that attains a near-optimal $\tilde{O}(T^{-1})$ convergence rate for computing a correlated equilibrium. Our algorithm is constructed by combining two main elements: (i) smooth value updates and (ii) the \emph{optimistic-follow-the-regularized-leader} algorithm with the log barrier regularizer.
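To make element (ii) concrete, below is a minimal, illustrative sketch of one optimistic-follow-the-regularized-leader step over the probability simplex with the log barrier regularizer $-\sum_i \log x_i$. Everything here (function names, the step size, the bisection solver, and the random stand-in losses) is our own assumption for illustration, not the paper's implementation; the smooth value updates of element (i) are omitted, with the loss vector standing in for smoothed Q-value feedback.

```python
import numpy as np

def log_barrier_oftrl_step(cum_loss, prediction, eta, tol=1e-10):
    """One optimistic-FTRL step with the log-barrier regularizer.

    Computes x = argmin over the simplex of <x, g> - (1/eta) * sum_i log(x_i),
    where g = cum_loss + prediction (the "optimistic" term). The KKT
    conditions give x_i = 1 / (eta * (g_i + lam)) for a multiplier lam
    chosen so the coordinates sum to one; lam is found by bisection,
    since the total mass is strictly decreasing in lam.
    """
    g = cum_loss + prediction
    lo = -g.min() + 1e-12           # positivity requires g_i + lam > 0
    hi = lo + len(g) / eta          # at hi each x_i <= 1/len(g), so mass <= 1
    while hi - lo > tol:
        lam = (lo + hi) / 2
        if np.sum(1.0 / (eta * (g + lam))) > 1.0:
            lo = lam                # mass too large -> increase lam
        else:
            hi = lam
    x = 1.0 / (eta * (g + (lo + hi) / 2))
    return x / x.sum()              # normalize away residual bisection error

# Hypothetical usage: a single player's per-state policy update.
rng = np.random.default_rng(0)
n_actions, eta = 4, 0.1
cum_loss = np.zeros(n_actions)
prev_loss = np.zeros(n_actions)
for t in range(100):
    policy = log_barrier_oftrl_step(cum_loss, prev_loss, eta)
    loss = rng.random(n_actions)    # stand-in for smoothed Q-value feedback
    cum_loss += loss
    prev_loss = loss                # optimism: predict the next loss equals the last
```

The log barrier, unlike the entropy regularizer, has no closed-form softmax solution, which is why the sketch solves the one-dimensional dual problem numerically.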

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-cai24a,
  title = {Near-Optimal Policy Optimization for Correlated Equilibrium in General-Sum {M}arkov Games},
  author = {Cai, Yang and Luo, Haipeng and Wei, Chen-Yu and Zheng, Weiqiang},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages = {3889--3897},
  year = {2024},
  editor = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume = {238},
  series = {Proceedings of Machine Learning Research},
  month = {02--04 May},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v238/cai24a/cai24a.pdf},
  url = {https://proceedings.mlr.press/v238/cai24a.html},
  abstract = {We study policy optimization algorithms for computing correlated equilibria in multi-player general-sum Markov Games. Previous results achieve $\tilde{O}(T^{-1/2})$ convergence rate to a correlated equilibrium and an accelerated $\tilde{O}(T^{-3/4})$ convergence rate to the weaker notion of coarse correlated equilibrium. In this paper, we improve both results significantly by providing an uncoupled policy optimization algorithm that attains a near-optimal $\tilde{O}(T^{-1})$ convergence rate for computing a correlated equilibrium. Our algorithm is constructed by combining two main elements: (i) smooth value updates and (ii) the \emph{optimistic-follow-the-regularized-leader} algorithm with the log barrier regularizer.}
}
Endnote
%0 Conference Paper
%T Near-Optimal Policy Optimization for Correlated Equilibrium in General-Sum Markov Games
%A Yang Cai
%A Haipeng Luo
%A Chen-Yu Wei
%A Weiqiang Zheng
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-cai24a
%I PMLR
%P 3889--3897
%U https://proceedings.mlr.press/v238/cai24a.html
%V 238
%X We study policy optimization algorithms for computing correlated equilibria in multi-player general-sum Markov Games. Previous results achieve $\tilde{O}(T^{-1/2})$ convergence rate to a correlated equilibrium and an accelerated $\tilde{O}(T^{-3/4})$ convergence rate to the weaker notion of coarse correlated equilibrium. In this paper, we improve both results significantly by providing an uncoupled policy optimization algorithm that attains a near-optimal $\tilde{O}(T^{-1})$ convergence rate for computing a correlated equilibrium. Our algorithm is constructed by combining two main elements: (i) smooth value updates and (ii) the \emph{optimistic-follow-the-regularized-leader} algorithm with the log barrier regularizer.
APA
Cai, Y., Luo, H., Wei, C. & Zheng, W. (2024). Near-Optimal Policy Optimization for Correlated Equilibrium in General-Sum Markov Games. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:3889-3897. Available from https://proceedings.mlr.press/v238/cai24a.html.