Equivalence Analysis between Counterfactual Regret Minimization and Online Mirror Descent

Weiming Liu; Huacong Jiang; Bin Li; Houqiang Li

Equivalence Analysis between Counterfactual Regret Minimization and Online Mirror Descent

Weiming Liu, Huacong Jiang, Bin Li, Houqiang Li

Proceedings of the 39th International Conference on Machine Learning, PMLR 162:13717-13745, 2022.

Abstract

Follow-the-Regularized-Leader (FTRL) and Online Mirror Descent (OMD) are regret minimization algorithms for Online Convex Optimization (OCO), they are mathematically elegant but less practical in solving Extensive-Form Games (EFGs). Counterfactual Regret Minimization (CFR) is a technique for approximating Nash equilibria in EFGs. CFR and its variants have a fast convergence rate in practice, but their theoretical results are not satisfactory. In recent years, researchers have been trying to link CFRs with OCO algorithms, which may provide new theoretical results and inspire new algorithms. However, existing analysis is restricted to local decision points. In this paper, we show that CFRs with Regret Matching and Regret Matching+ are equivalent to special cases of FTRL and OMD, respectively. According to these equivalences, a new FTRL and a new OMD algorithm, which can be considered as extensions of vanilla CFR and CFR+, are derived. The experimental results show that the two variants converge faster than conventional FTRL and OMD, even faster than vanilla CFR and CFR+ in some EFGs.

Cite this Paper

BibTeX


@InProceedings{pmlr-v162-liu22e,
  title = 	 {Equivalence Analysis between Counterfactual Regret Minimization and Online Mirror Descent},
  author =       {Liu, Weiming and Jiang, Huacong and Li, Bin and Li, Houqiang},
  booktitle = 	 {Proceedings of the 39th International Conference on Machine Learning},
  pages = 	 {13717--13745},
  year = 	 {2022},
  editor = 	 {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume = 	 {162},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {17--23 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v162/liu22e/liu22e.pdf},
  url = 	 {https://proceedings.mlr.press/v162/liu22e.html},
  abstract = 	 {Follow-the-Regularized-Leader (FTRL) and Online Mirror Descent (OMD) are regret minimization algorithms for Online Convex Optimization (OCO), they are mathematically elegant but less practical in solving Extensive-Form Games (EFGs). Counterfactual Regret Minimization (CFR) is a technique for approximating Nash equilibria in EFGs. CFR and its variants have a fast convergence rate in practice, but their theoretical results are not satisfactory. In recent years, researchers have been trying to link CFRs with OCO algorithms, which may provide new theoretical results and inspire new algorithms. However, existing analysis is restricted to local decision points. In this paper, we show that CFRs with Regret Matching and Regret Matching+ are equivalent to special cases of FTRL and OMD, respectively. According to these equivalences, a new FTRL and a new OMD algorithm, which can be considered as extensions of vanilla CFR and CFR+, are derived. The experimental results show that the two variants converge faster than conventional FTRL and OMD, even faster than vanilla CFR and CFR+ in some EFGs.}
}

Endnote

%0 Conference Paper
%T Equivalence Analysis between Counterfactual Regret Minimization and Online Mirror Descent
%A Weiming Liu
%A Huacong Jiang
%A Bin Li
%A Houqiang Li
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato	
%F pmlr-v162-liu22e
%I PMLR
%P 13717--13745
%U https://proceedings.mlr.press/v162/liu22e.html
%V 162
%X Follow-the-Regularized-Leader (FTRL) and Online Mirror Descent (OMD) are regret minimization algorithms for Online Convex Optimization (OCO), they are mathematically elegant but less practical in solving Extensive-Form Games (EFGs). Counterfactual Regret Minimization (CFR) is a technique for approximating Nash equilibria in EFGs. CFR and its variants have a fast convergence rate in practice, but their theoretical results are not satisfactory. In recent years, researchers have been trying to link CFRs with OCO algorithms, which may provide new theoretical results and inspire new algorithms. However, existing analysis is restricted to local decision points. In this paper, we show that CFRs with Regret Matching and Regret Matching+ are equivalent to special cases of FTRL and OMD, respectively. According to these equivalences, a new FTRL and a new OMD algorithm, which can be considered as extensions of vanilla CFR and CFR+, are derived. The experimental results show that the two variants converge faster than conventional FTRL and OMD, even faster than vanilla CFR and CFR+ in some EFGs.

APA


Liu, W., Jiang, H., Li, B. & Li, H.. (2022). Equivalence Analysis between Counterfactual Regret Minimization and Online Mirror Descent. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:13717-13745 Available from https://proceedings.mlr.press/v162/liu22e.html.

Related Material

Download PDF