[edit]
Open Problem: Optimal Instance-Dependent Sample Complexity for finding Nash Equilibrium in Two Player Zero-Sum Matrix games
Proceedings of Thirty Eighth Conference on Learning Theory, PMLR 291:6230-6234, 2025.
Abstract
Optimal instance-dependent sample complexity is a well-studied topic in the multi-armed bandit literature. However, the analogous question in the setting of two-player zero-sum matrix games, where the payoff matrix can only be accessed through noisy samples, remains largely unexplored despite being a natural generalization of the multi-armed bandit problem. In this write-up, we pose a simple open question: What is the optimal instance-dependent sample complexity to find an approximate Nash equilibrium in two-player zero-sum matrix games?