Stochastic Regret Minimization in Extensive-Form Games

Gabriele Farina, Christian Kroer, Tuomas Sandholm
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:3018-3028, 2020.

Abstract

Monte-Carlo counterfactual regret minimization (MCCFR) is the state-of-the-art algorithm for solving sequential games that are too large for full tree traversals. It works by using gradient estimates that can be computed via sampling. However, stochastic methods for sequential games have not been investigated extensively beyond MCCFR. In this paper we develop a new framework for developing stochastic regret minimization methods. This framework allows us to use any regret-minimization algorithm, coupled with any gradient estimator. The MCCFR algorithm can be analyzed as a special case of our framework, and this analysis leads to significantly stronger theoretical guarantees on convergence, while simultaneously yielding a simplified proof. Our framework allows us to instantiate several new stochastic methods for solving sequential games. We show extensive experiments on five games, where some variants of our methods outperform MCCFR.

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-farina20a, title = {Stochastic Regret Minimization in Extensive-Form Games}, author = {Farina, Gabriele and Kroer, Christian and Sandholm, Tuomas}, booktitle = {Proceedings of the 37th International Conference on Machine Learning}, pages = {3018--3028}, year = {2020}, editor = {III, Hal Daumé and Singh, Aarti}, volume = {119}, series = {Proceedings of Machine Learning Research}, month = {13--18 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v119/farina20a/farina20a.pdf}, url = {https://proceedings.mlr.press/v119/farina20a.html}, abstract = {Monte-Carlo counterfactual regret minimization (MCCFR) is the state-of-the-art algorithm for solving sequential games that are too large for full tree traversals. It works by using gradient estimates that can be computed via sampling. However, stochastic methods for sequential games have not been investigated extensively beyond MCCFR. In this paper we develop a new framework for developing stochastic regret minimization methods. This framework allows us to use any regret-minimization algorithm, coupled with any gradient estimator. The MCCFR algorithm can be analyzed as a special case of our framework, and this analysis leads to significantly stronger theoretical guarantees on convergence, while simultaneously yielding a simplified proof. Our framework allows us to instantiate several new stochastic methods for solving sequential games. We show extensive experiments on five games, where some variants of our methods outperform MCCFR.} }
Endnote
%0 Conference Paper %T Stochastic Regret Minimization in Extensive-Form Games %A Gabriele Farina %A Christian Kroer %A Tuomas Sandholm %B Proceedings of the 37th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2020 %E Hal Daumé III %E Aarti Singh %F pmlr-v119-farina20a %I PMLR %P 3018--3028 %U https://proceedings.mlr.press/v119/farina20a.html %V 119 %X Monte-Carlo counterfactual regret minimization (MCCFR) is the state-of-the-art algorithm for solving sequential games that are too large for full tree traversals. It works by using gradient estimates that can be computed via sampling. However, stochastic methods for sequential games have not been investigated extensively beyond MCCFR. In this paper we develop a new framework for developing stochastic regret minimization methods. This framework allows us to use any regret-minimization algorithm, coupled with any gradient estimator. The MCCFR algorithm can be analyzed as a special case of our framework, and this analysis leads to significantly stronger theoretical guarantees on convergence, while simultaneously yielding a simplified proof. Our framework allows us to instantiate several new stochastic methods for solving sequential games. We show extensive experiments on five games, where some variants of our methods outperform MCCFR.
APA
Farina, G., Kroer, C. & Sandholm, T.. (2020). Stochastic Regret Minimization in Extensive-Form Games. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:3018-3028 Available from https://proceedings.mlr.press/v119/farina20a.html.

Related Material