Decoding Rewards in Competitive Games: Inverse Game Theory with Entropy Regularization

Junyi Liao, Zihan Zhu, Ethan X Fang, Zhuoran Yang, Vahid Tarokh
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:37610-37622, 2025.

Abstract

Estimating the unknown reward functions driving agents’ behavior is a central challenge in inverse games and reinforcement learning. This paper introduces a unified framework for reward function recovery in two-player zero-sum matrix games and Markov games with entropy regularization. Given observed player strategies and actions, we aim to reconstruct the underlying reward functions. This task is challenging due to the inherent ambiguity of inverse problems, the non-uniqueness of feasible rewards, and limited observational data coverage. To address these challenges, we establish reward function identifiability using the quantal response equilibrium (QRE) under linear assumptions. Building on this theoretical foundation, we propose an algorithm to learn reward functions from observed actions, designed to capture all plausible reward parameters by constructing confidence sets. Our algorithm works in both static and dynamic settings and can be adapted to incorporate other methods, such as maximum likelihood estimation (MLE). We provide strong theoretical guarantees for the reliability and sample efficiency of our algorithm. Empirical results demonstrate the framework’s effectiveness in accurately recovering reward functions across various scenarios, offering new insights into decision-making in competitive environments.
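
To make the setting concrete, the following is a minimal sketch of the standard entropy-regularized (quantal response) equilibrium conditions for a two-player zero-sum matrix game; the payoff matrix $A$, strategies $x^*, y^*$, and temperature $\tau$ are generic notation and may differ from the paper’s own formulation. With player 1 maximizing $x^\top A y + \tau H(x)$ and player 2 minimizing $x^\top A y - \tau H(y)$, where $H$ is the Shannon entropy, the regularized game is strongly concave–convex and admits a unique equilibrium characterized by coupled softmax conditions:

\[
x^*(a) \;\propto\; \exp\!\big([A y^*]_a / \tau\big),
\qquad
y^*(b) \;\propto\; \exp\!\big(-[A^\top x^*]_b / \tau\big).
\]

Inverting these relations gives, for any pair of actions $a, a'$, $\log x^*(a) - \log x^*(a') = \big([A y^*]_a - [A y^*]_{a'}\big)/\tau$, so observed strategies constrain $A$ only through such differences. This illustrates why behavior alone does not pin down the reward uniquely, and why additional structure, such as the linear parameterization and confidence-set construction described in the abstract, is used to characterize the set of plausible reward parameters.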

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-liao25i,
  title     = {Decoding Rewards in Competitive Games: Inverse Game Theory with Entropy Regularization},
  author    = {Liao, Junyi and Zhu, Zihan and Fang, Ethan X and Yang, Zhuoran and Tarokh, Vahid},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {37610--37622},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/liao25i/liao25i.pdf},
  url       = {https://proceedings.mlr.press/v267/liao25i.html},
  abstract  = {Estimating the unknown reward functions driving agents’ behavior is a central challenge in inverse games and reinforcement learning. This paper introduces a unified framework for reward function recovery in two-player zero-sum matrix games and Markov games with entropy regularization. Given observed player strategies and actions, we aim to reconstruct the underlying reward functions. This task is challenging due to the inherent ambiguity of inverse problems, the non-uniqueness of feasible rewards, and limited observational data coverage. To address these challenges, we establish reward function identifiability using the quantal response equilibrium (QRE) under linear assumptions. Building on this theoretical foundation, we propose an algorithm to learn reward from observed actions, designed to capture all plausible reward parameters by constructing confidence sets. Our algorithm works in both static and dynamic settings and is adaptable to incorporate other methods, such as Maximum Likelihood Estimation (MLE). We provide strong theoretical guarantees for the reliability and sample-efficiency of our algorithm. Empirical results demonstrate the framework’s effectiveness in accurately recovering reward functions across various scenarios, offering new insights into decision-making in competitive environments.}
}
Endnote
%0 Conference Paper
%T Decoding Rewards in Competitive Games: Inverse Game Theory with Entropy Regularization
%A Junyi Liao
%A Zihan Zhu
%A Ethan X Fang
%A Zhuoran Yang
%A Vahid Tarokh
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-liao25i
%I PMLR
%P 37610--37622
%U https://proceedings.mlr.press/v267/liao25i.html
%V 267
%X Estimating the unknown reward functions driving agents’ behavior is a central challenge in inverse games and reinforcement learning. This paper introduces a unified framework for reward function recovery in two-player zero-sum matrix games and Markov games with entropy regularization. Given observed player strategies and actions, we aim to reconstruct the underlying reward functions. This task is challenging due to the inherent ambiguity of inverse problems, the non-uniqueness of feasible rewards, and limited observational data coverage. To address these challenges, we establish reward function identifiability using the quantal response equilibrium (QRE) under linear assumptions. Building on this theoretical foundation, we propose an algorithm to learn reward from observed actions, designed to capture all plausible reward parameters by constructing confidence sets. Our algorithm works in both static and dynamic settings and is adaptable to incorporate other methods, such as Maximum Likelihood Estimation (MLE). We provide strong theoretical guarantees for the reliability and sample-efficiency of our algorithm. Empirical results demonstrate the framework’s effectiveness in accurately recovering reward functions across various scenarios, offering new insights into decision-making in competitive environments.
APA
Liao, J., Zhu, Z., Fang, E.X., Yang, Z. & Tarokh, V. (2025). Decoding Rewards in Competitive Games: Inverse Game Theory with Entropy Regularization. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:37610-37622. Available from https://proceedings.mlr.press/v267/liao25i.html.
