Vortices Instead of Equilibria in MinMax Optimization: Chaos and Butterfly Effects of Online Learning in ZeroSum Games
[edit]
Proceedings of the ThirtySecond Conference on Learning Theory, PMLR 99:807834, 2019.
Abstract
We establish that algorithmic experiments in zerosum games “fail miserably” to confirm the unique, sharp prediction of maxmin equilibration. Contradicting nearly a century of economic thought that treats zerosum games nearly axiomatically as the exemplar symbol of economic stability, we prove that no meaningful prediction can be made about the daytoday behavior of online learning dynamics in zerosum games. Concretely, Multiplicative Weights Updates (MWU) with constant stepsize is \emph{Lyapunov chaotic} in the dual (payoff) space. Simply put, let’s assume that an observer asks the agents playing MatchingPennies whether they prefer Heads or Tails (and by how much in terms of aggregate payoff so far). The range of possible answers consistent with any arbitrary small set of initial conditions blows up exponentially with time everywhere in the payoff space. This result is \emph{robust} both \emph{algorithmically} as well as \emph{game theoretically}. \textbf{Algorithmic robustness:} Chaos is robust to agents using any of a general subfamily of FollowtheRegularizedLeader (FTRL) algorithms, the well known regretminimizing dynamics, even when agents mixandmatch dynamics, use different or slowly decreasing stepsizes. \textbf{Game theoretic robustness:} Chaos is robust to all affine variants of zerosum games (strictly competitive games), network variants with arbitrary large number of agents and even to competitive settings beyond these. Our result is in stark contrast with the timeaverage convergence of online learning to (approximate) Nash equilibrium, a result widely reported as “(weak) convergence to equilibrium”.
Related Material


