Limiting Behaviors of Nonconvex-Nonconcave Minimax Optimization via Continuous-Time Systems

Benjamin Grimmer, Haihao Lu, Pratik Worah, Vahab Mirrokni
Proceedings of The 33rd International Conference on Algorithmic Learning Theory, PMLR 167:465-487, 2022.

Abstract

Unlike nonconvex optimization, where gradient descent is guaranteed to converge to a local optimizer, algorithms for nonconvex-nonconcave minimax optimization can have topologically different solution paths: sometimes converging to a solution, sometimes never converging and instead following a limit cycle, and sometimes diverging. In this paper, we study the limiting behaviors of three classic minimax algorithms: gradient descent ascent (GDA), alternating gradient descent ascent (AGDA), and the extragradient method (EGM). Numerically, we observe that all of these limiting behaviors can arise in Generative Adversarial Network (GAN) training and are easily demonstrated even in simple GAN models. To explain these different behaviors, we study the high-order resolution continuous-time dynamics that correspond to each algorithm, which yields sufficient (and almost necessary) conditions for local convergence of each method. Moreover, this ODE perspective allows us to characterize the phase transition between these potentially nonconvergent limiting behaviors caused by introducing regularization in the problem instance.
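The contrast among these limiting behaviors is easy to reproduce numerically. Below is a minimal sketch (not taken from the paper) of the three methods on the classic bilinear toy problem min_x max_y f(x, y) = x*y, whose only stationary point is the saddle at the origin; the step size and iteration count are illustrative assumptions.

# Minimal sketch: GDA, AGDA, and EGM on the bilinear toy problem
# min_x max_y f(x, y) = x * y. Step size and horizon are illustrative choices,
# not values from the paper.
import numpy as np

eta = 0.1  # step size (an assumption for illustration)

def grad(x, y):
    # f(x, y) = x * y, so df/dx = y and df/dy = x.
    return y, x

def gda(x, y):
    # Simultaneous GDA: both players step using the current iterate.
    gx, gy = grad(x, y)
    return x - eta * gx, y + eta * gy

def agda(x, y):
    # Alternating GDA: y steps using the already-updated x.
    gx, _ = grad(x, y)
    x = x - eta * gx
    _, gy = grad(x, y)
    return x, y + eta * gy

def egm(x, y):
    # Extragradient: take a half-step, then update the original iterate
    # using the gradient evaluated at the half-step point.
    gx, gy = grad(x, y)
    xh, yh = x - eta * gx, y + eta * gy
    gx, gy = grad(xh, yh)
    return x - eta * gx, y + eta * gy

for name, step in [("GDA", gda), ("AGDA", agda), ("EGM", egm)]:
    x, y = 1.0, 1.0
    for _ in range(200):
        x, y = step(x, y)
    print(f"{name}: distance to the saddle (0, 0) after 200 steps = {np.hypot(x, y):.3f}")

On this problem the printed distance grows for GDA (divergence), oscillates around its starting value for AGDA (a bounded orbit that never converges), and shrinks for EGM (convergence), matching the three behaviors described above; the paper's analysis characterizes when each behavior arises in more general nonconvex-nonconcave settings.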

Cite this Paper


BibTeX
@InProceedings{pmlr-v167-grimmer22a,
  title     = {Limiting Behaviors of Nonconvex-Nonconcave Minimax Optimization via Continuous-Time Systems},
  author    = {Grimmer, Benjamin and Lu, Haihao and Worah, Pratik and Mirrokni, Vahab},
  booktitle = {Proceedings of The 33rd International Conference on Algorithmic Learning Theory},
  pages     = {465--487},
  year      = {2022},
  editor    = {Dasgupta, Sanjoy and Haghtalab, Nika},
  volume    = {167},
  series    = {Proceedings of Machine Learning Research},
  month     = {29 Mar--01 Apr},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v167/grimmer22a/grimmer22a.pdf},
  url       = {https://proceedings.mlr.press/v167/grimmer22a.html},
  abstract  = {Unlike nonconvex optimization, where gradient descent is guaranteed to converge to a local optimizer, algorithms for nonconvex-nonconcave minimax optimization can have topologically different solution paths: sometimes converging to a solution, sometimes never converging and instead following a limit cycle, and sometimes diverging. In this paper, we study the limiting behaviors of three classic minimax algorithms: gradient descent ascent (GDA), alternating gradient descent ascent (AGDA), and the extragradient method (EGM). Numerically, we observe that all of these limiting behaviors can arise in Generative Adversarial Network (GAN) training and are easily demonstrated even in simple GAN models. To explain these different behaviors, we study the high-order resolution continuous-time dynamics that correspond to each algorithm, which yields sufficient (and almost necessary) conditions for local convergence of each method. Moreover, this ODE perspective allows us to characterize the phase transition between these potentially nonconvergent limiting behaviors caused by introducing regularization in the problem instance.}
}
Endnote
%0 Conference Paper
%T Limiting Behaviors of Nonconvex-Nonconcave Minimax Optimization via Continuous-Time Systems
%A Benjamin Grimmer
%A Haihao Lu
%A Pratik Worah
%A Vahab Mirrokni
%B Proceedings of The 33rd International Conference on Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2022
%E Sanjoy Dasgupta
%E Nika Haghtalab
%F pmlr-v167-grimmer22a
%I PMLR
%P 465--487
%U https://proceedings.mlr.press/v167/grimmer22a.html
%V 167
%X Unlike nonconvex optimization, where gradient descent is guaranteed to converge to a local optimizer, algorithms for nonconvex-nonconcave minimax optimization can have topologically different solution paths: sometimes converging to a solution, sometimes never converging and instead following a limit cycle, and sometimes diverging. In this paper, we study the limiting behaviors of three classic minimax algorithms: gradient descent ascent (GDA), alternating gradient descent ascent (AGDA), and the extragradient method (EGM). Numerically, we observe that all of these limiting behaviors can arise in Generative Adversarial Network (GAN) training and are easily demonstrated even in simple GAN models. To explain these different behaviors, we study the high-order resolution continuous-time dynamics that correspond to each algorithm, which yields sufficient (and almost necessary) conditions for local convergence of each method. Moreover, this ODE perspective allows us to characterize the phase transition between these potentially nonconvergent limiting behaviors caused by introducing regularization in the problem instance.
APA
Grimmer, B., Lu, H., Worah, P. & Mirrokni, V. (2022). Limiting Behaviors of Nonconvex-Nonconcave Minimax Optimization via Continuous-Time Systems. Proceedings of The 33rd International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 167:465-487. Available from https://proceedings.mlr.press/v167/grimmer22a.html.
