Open problem: Convergence of single-timescale mean-field Langevin descent-ascent for two-player zero-sum games
Proceedings of Thirty Seventh Conference on Learning Theory, PMLR 247:5345-5350, 2024.
Abstract
Let $f: T^d \times T^d \to \mathbb{R}$ be a smooth function, where $T^d$ is the $d$-torus, and let $\beta>0$. Consider the min-max objective functional $F_\beta(\mu, \nu) = \iint f \, d\mu \, d\nu + \beta^{-1} H(\mu) - \beta^{-1} H(\nu)$ over $\mathcal{P}(T^d) \times \mathcal{P}(T^d)$, where $H$ denotes the negative differential entropy. Its unique saddle point defines the entropy-regularized mixed Nash equilibrium of a two-player zero-sum game, and its Wasserstein gradient descent-ascent flow $(\mu_t, \nu_t)$ corresponds to the mean-field limit of a Langevin descent-ascent dynamics. Do $\mu_t$ and $\nu_t$ converge (weakly, say) as $t \to \infty$, for any $f$ and $\beta$? This rather natural qualitative question is still open, and it is not clear whether it can be addressed using the tools currently available for the analysis of dynamics in Wasserstein space. Even though the simple trick of using different timescales for the ascent and the descent is known to guarantee convergence, we propose this question as a toy setting to further our understanding of the Wasserstein geometry for optimization.
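To make the dynamics concrete, the following is a minimal sketch of a finite-particle, Euler-Maruyama discretization of single-timescale Langevin descent-ascent, whose mean-field limit is the flow $(\mu_t, \nu_t)$ discussed above. Everything specific here is an assumption made for illustration and is not taken from the paper: the dimension $d = 1$, the payoff $f(x, y) = \sin(x - y)$, the function and parameter names, and the default values. Both players use the same step size, which is what "single-timescale" refers to.

```python
import math
import random

TWO_PI = 2.0 * math.pi


def grad_x_f(x, y):
    # Hypothetical payoff f(x, y) = sin(x - y) on the 1-torus: df/dx = cos(x - y).
    return math.cos(x - y)


def grad_y_f(x, y):
    # For the same hypothetical payoff: df/dy = -cos(x - y).
    return -math.cos(x - y)


def langevin_descent_ascent(n_particles=50, n_steps=200, step=0.01, beta=1.0, seed=0):
    """Simulate interacting particles X_i (descent player) and Y_j (ascent player)."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, TWO_PI) for _ in range(n_particles)]
    ys = [rng.uniform(0.0, TWO_PI) for _ in range(n_particles)]
    # Noise scale sqrt(2 * step / beta) corresponds to the beta^{-1} entropy terms.
    noise = math.sqrt(2.0 * step / beta)
    for _ in range(n_steps):
        # Each particle sees the payoff gradient averaged over the opponent's particles.
        drift_x = [sum(grad_x_f(x, y) for y in ys) / n_particles for x in xs]
        drift_y = [sum(grad_y_f(x, y) for x in xs) / n_particles for y in ys]
        # Simultaneous update with the same step size: descent in x, ascent in y.
        # The modulo keeps the particles on the torus [0, 2*pi).
        xs = [(x - step * g + noise * rng.gauss(0.0, 1.0)) % TWO_PI
              for x, g in zip(xs, drift_x)]
        ys = [(y + step * g + noise * rng.gauss(0.0, 1.0)) % TWO_PI
              for y, g in zip(ys, drift_y)]
    return xs, ys
```

Calling `langevin_descent_ascent()` returns the final particle positions of both players, whose empirical distributions approximate $\mu_t$ and $\nu_t$ at time `n_steps * step` only up to discretization and finite-particle error. Such a simulation can illustrate the dynamics but cannot settle the convergence question posed above.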