Preference-CFR: Beyond Nash Equilibrium for Better Game Strategies

Qi Ju, Thomas Tellier, Meng Sun, Zhemei Fang, Yunfeng Luo
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:28463-28480, 2025.

Abstract

Artificial intelligence (AI) has surpassed top human players in a variety of games. In imperfect information games, these achievements have primarily been driven by Counterfactual Regret Minimization (CFR) and its variants for computing Nash equilibrium. However, most existing research has focused on maximizing payoff, while largely neglecting the importance of strategic diversity and the need for varied play styles, thereby limiting AI’s adaptability to different user preferences. To address this gap, we propose Preference-CFR (Pref-CFR), a novel method that incorporates two key parameters: preference degree and vulnerability degree. These parameters enable the AI to adjust its strategic distribution within an acceptable performance loss threshold, thereby enhancing its adaptability to a wider range of strategic demands. In our experiments with Texas Hold’em, Pref-CFR successfully trained Aggressive and Loose Passive styles that not only match original CFR-based strategies in performance but also display clearly distinct behavioral patterns. Notably, for certain hand scenarios, Pref-CFR produces strategies that diverge significantly from both conventional expert heuristics and original CFR outputs, potentially offering novel insights for professional players.
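The exact Pref-CFR update rule is given in the paper itself, not on this page. As a rough illustration of the idea described above, the sketch below shows standard regret matching (the strategy-update step used in CFR) together with a hypothetical preference-biased variant that blends the regret-matching strategy with a user-supplied preference distribution, trading a small amount of payoff for a desired style. The function names and the pref_degree parameter are illustrative stand-ins for the paper's preference degree and vulnerability degree, not the authors' exact formulation.

import numpy as np

def regret_matching(regrets):
    """Standard regret matching: play actions in proportion to positive regret."""
    positive = np.maximum(regrets, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    # No positive regret: fall back to the uniform strategy.
    return np.full_like(regrets, 1.0 / len(regrets))

def preference_weighted_strategy(regrets, preference, pref_degree=0.3):
    """Hypothetical preference-biased update (illustrative only): mix the
    regret-matching strategy with a preferred action distribution.
    pref_degree controls how far the strategy is pulled toward the preference."""
    base = regret_matching(regrets)
    mixed = (1.0 - pref_degree) * base + pref_degree * preference
    return mixed / mixed.sum()

# Toy example with three actions (fold, call, raise) and a preference for raising.
regrets = np.array([1.0, 2.0, 0.5])
prefer_raise = np.array([0.0, 0.0, 1.0])
print(regret_matching(regrets))                                   # purely payoff-driven
print(preference_weighted_strategy(regrets, prefer_raise, 0.3))   # style-biased

In this toy setting, raising the (hypothetical) pref_degree shifts probability mass toward the preferred action while leaving the regret-driven ranking of the remaining actions intact, which is the kind of controlled stylistic shift the abstract describes.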

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-ju25b,
  title     = {Preference-{CFR}: Beyond {N}ash Equilibrium for Better Game Strategies},
  author    = {Ju, Qi and Tellier, Thomas and Sun, Meng and Fang, Zhemei and Luo, Yunfeng},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {28463--28480},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/ju25b/ju25b.pdf},
  url       = {https://proceedings.mlr.press/v267/ju25b.html},
  abstract  = {Artificial intelligence (AI) has surpassed top human players in a variety of games. In imperfect information games, these achievements have primarily been driven by Counterfactual Regret Minimization (CFR) and its variants for computing Nash equilibrium. However, most existing research has focused on maximizing payoff, while largely neglecting the importance of strategic diversity and the need for varied play styles, thereby limiting AI’s adaptability to different user preferences. To address this gap, we propose Preference-CFR (Pref-CFR), a novel method that incorporates two key parameters: preference degree and vulnerability degree. These parameters enable the AI to adjust its strategic distribution within an acceptable performance loss threshold, thereby enhancing its adaptability to a wider range of strategic demands. In our experiments with Texas Hold’em, Pref-CFR successfully trained Aggressive and Loose Passive styles that not only match original CFR-based strategies in performance but also display clearly distinct behavioral patterns. Notably, for certain hand scenarios, Pref-CFR produces strategies that diverge significantly from both conventional expert heuristics and original CFR outputs, potentially offering novel insights for professional players.}
}
Endnote
%0 Conference Paper
%T Preference-CFR: Beyond Nash Equilibrium for Better Game Strategies
%A Qi Ju
%A Thomas Tellier
%A Meng Sun
%A Zhemei Fang
%A Yunfeng Luo
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-ju25b
%I PMLR
%P 28463--28480
%U https://proceedings.mlr.press/v267/ju25b.html
%V 267
%X Artificial intelligence (AI) has surpassed top human players in a variety of games. In imperfect information games, these achievements have primarily been driven by Counterfactual Regret Minimization (CFR) and its variants for computing Nash equilibrium. However, most existing research has focused on maximizing payoff, while largely neglecting the importance of strategic diversity and the need for varied play styles, thereby limiting AI’s adaptability to different user preferences. To address this gap, we propose Preference-CFR (Pref-CFR), a novel method that incorporates two key parameters: preference degree and vulnerability degree. These parameters enable the AI to adjust its strategic distribution within an acceptable performance loss threshold, thereby enhancing its adaptability to a wider range of strategic demands. In our experiments with Texas Hold’em, Pref-CFR successfully trained Aggressive and Loose Passive styles that not only match original CFR-based strategies in performance but also display clearly distinct behavioral patterns. Notably, for certain hand scenarios, Pref-CFR produces strategies that diverge significantly from both conventional expert heuristics and original CFR outputs, potentially offering novel insights for professional players.
APA
Ju, Q., Tellier, T., Sun, M., Fang, Z. & Luo, Y. (2025). Preference-CFR: Beyond Nash Equilibrium for Better Game Strategies. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:28463-28480. Available from https://proceedings.mlr.press/v267/ju25b.html.