Learning Nash Equilibrium for General-Sum Markov Games from Batch Data
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, PMLR 54:232-241, 2017.
Abstract
This paper addresses the problem of learning a Nash equilibrium in $γ$-discounted multiplayer general-sum Markov Games (MGs) in a batch setting. As the number of players in an MG increases, the agents may either collaborate or team apart to increase their final rewards. One solution to address this problem is to look for a Nash equilibrium. Although several techniques exist for the subcase of two-player zero-sum MGs, those techniques fail to find a Nash equilibrium in general-sum Markov Games. In this paper, we introduce a new definition of $ε$-Nash equilibrium in MGs which captures the quality of a strategy in multiplayer games. We prove that minimizing the norm of two Bellman-like residuals implies learning such an $ε$-Nash equilibrium. Then, we show that minimizing an empirical estimate of the $L_p$ norm of these Bellman-like residuals allows learning for general-sum games within the batch setting. Finally, we introduce a neural network architecture that successfully learns a Nash equilibrium in generic multiplayer general-sum turn-based MGs.
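To illustrate the kind of quantity the abstract refers to, here is a minimal single-agent sketch of an empirical $L_p$ norm of a one-step Bellman residual over a batch of transitions. This is an illustrative simplification, not the paper's method: the paper works with per-player Bellman-like residuals in general-sum MGs, and the function and variable names below are hypothetical.

```python
import numpy as np

def empirical_bellman_residual_lp(Q, batch, gamma=0.9, p=2):
    """Empirical L_p norm of the one-step optimal Bellman residual.

    Q      : (num_states, num_actions) tabular state-action value estimate
    batch  : iterable of transitions (s, a, r, s_next) sampled offline
    gamma  : discount factor
    p      : order of the L_p norm
    """
    residuals = []
    for s, a, r, s_next in batch:
        # One-step Bellman backup as the regression target.
        target = r + gamma * np.max(Q[s_next])
        residuals.append(abs(Q[s, a] - target) ** p)
    # Empirical L_p norm over the batch.
    return float(np.mean(residuals) ** (1.0 / p))
```

In the batch setting this quantity is computed from a fixed dataset of transitions with no further environment interaction; the paper's approach minimizes such residual norms (for all players jointly) with a neural network rather than a table.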