Faster Single-loop Algorithms for Minimax Optimization without Strong Concavity

Junchi Yang, Antonio Orvieto, Aurelien Lucchi, Niao He
Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:5485-5517, 2022.

Abstract

Gradient descent ascent (GDA), the simplest single-loop algorithm for nonconvex minimax optimization, is widely used in practical applications such as generative adversarial networks (GANs) and adversarial training. Despite its desirable simplicity, recent work shows that GDA suffers from inferior convergence rates in theory, even when the objective is assumed to be strongly concave in one of the variables. This paper establishes new convergence results for two alternative single-loop algorithms, alternating GDA and smoothed GDA, under the mild assumption that the objective satisfies the Polyak-Łojasiewicz (PL) condition with respect to one variable. We prove that, to find an $\epsilon$-stationary point, (i) alternating GDA and its stochastic variant (without mini-batching) require $O(\kappa^{2} \epsilon^{-2})$ and $O(\kappa^{4} \epsilon^{-4})$ iterations, respectively, while (ii) smoothed GDA and its stochastic variant (without mini-batching) require $O(\kappa \epsilon^{-2})$ and $O(\kappa^{2} \epsilon^{-4})$ iterations, respectively, where $\kappa$ denotes the condition number. The latter greatly improves over vanilla GDA and gives the best complexity results known to date among single-loop algorithms under comparable settings. We further showcase the empirical efficiency of these algorithms in training GANs and in robust nonlinear regression.
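The problem setting is $\min_x \max_y f(x, y)$ with $f$ smooth, possibly nonconvex in $x$, and PL in $y$, i.e. $\frac{1}{2}\|\nabla_y f(x, y)\|^2 \ge \mu\,(\max_{y'} f(x, y') - f(x, y))$ for some $\mu > 0$; $\kappa$ is the associated condition number. The snippet below is only a rough sketch of the two update schemes named in the abstract, not the paper's exact algorithms or step-size choices: alternating GDA takes the ascent step on $y$ at the freshly updated $x$, while the smoothed-GDA-style update descends on a proximally regularized objective and tracks $x$ with an exponentially averaged anchor $z$ (a common way of formulating the smoothing, assumed here rather than taken from the paper). All names and hyperparameters (lr_x, lr_y, p, beta) are placeholders.

import numpy as np

def alternating_gda_step(x, y, grad_x, grad_y, lr_x=0.05, lr_y=0.5):
    # One alternating GDA step for min_x max_y f(x, y).
    # grad_x(x, y) and grad_y(x, y) return the partial gradients of f.
    # The ascent on y uses the freshly updated x, which is what
    # distinguishes alternating GDA from simultaneous GDA.
    x_new = x - lr_x * grad_x(x, y)        # gradient descent on x
    y_new = y + lr_y * grad_y(x_new, y)    # gradient ascent on y at x_new
    return x_new, y_new

def smoothed_gda_step(x, y, z, grad_x, grad_y,
                      lr_x=0.05, lr_y=0.5, p=1.0, beta=0.1):
    # One smoothed-GDA-style step (illustrative sketch; hyperparameters are
    # placeholders, not the values prescribed by the paper's analysis).
    # The x-step is taken on the regularized objective
    # f(x, y) + (p/2) * ||x - z||^2, and the anchor z tracks x by
    # exponential averaging.
    x_new = x - lr_x * (grad_x(x, y) + p * (x - z))
    y_new = y + lr_y * grad_y(x_new, y)
    z_new = z + beta * (x_new - z)
    return x_new, y_new, z_new

# Toy problem: f(x, y) = <x, y> - 0.5 * ||y||^2, which is strongly concave
# (hence PL) in y; its minimax point is (0, 0).
grad_x = lambda x, y: y
grad_y = lambda x, y: x - y

xa, ya = np.ones(3), np.zeros(3)
xs, ys, zs = np.ones(3), np.zeros(3), np.ones(3)
for _ in range(2000):
    xa, ya = alternating_gda_step(xa, ya, grad_x, grad_y)
    xs, ys, zs = smoothed_gda_step(xs, ys, zs, grad_x, grad_y)
print(np.linalg.norm(xa), np.linalg.norm(xs))  # both shrink toward 0

On this toy quadratic, which is strongly concave (hence PL) in $y$, both schemes drive the iterates to the minimax point at the origin; the complexity bounds quoted in the abstract concern the general nonconvex-PL and stochastic settings.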

Cite this Paper


BibTeX
@InProceedings{pmlr-v151-yang22b,
  title     = {Faster Single-loop Algorithms for Minimax Optimization without Strong Concavity},
  author    = {Yang, Junchi and Orvieto, Antonio and Lucchi, Aurelien and He, Niao},
  booktitle = {Proceedings of The 25th International Conference on Artificial Intelligence and Statistics},
  pages     = {5485--5517},
  year      = {2022},
  editor    = {Camps-Valls, Gustau and Ruiz, Francisco J. R. and Valera, Isabel},
  volume    = {151},
  series    = {Proceedings of Machine Learning Research},
  month     = {28--30 Mar},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v151/yang22b/yang22b.pdf},
  url       = {https://proceedings.mlr.press/v151/yang22b.html},
  abstract  = {Gradient descent ascent (GDA), the simplest single-loop algorithm for nonconvex minimax optimization, is widely used in practical applications such as generative adversarial networks (GANs) and adversarial training. Albeit its desirable simplicity, recent work shows inferior convergence rates of GDA in theory, even when assuming strong concavity of the objective in terms of one variable. This paper establishes new convergence results for two alternative single-loop algorithms – alternating GDA and smoothed GDA – under the mild assumption that the objective satisfies the Polyak-Lojasiewicz (PL) condition about one variable. We prove that, to find an $\epsilon$-stationary point, (i) alternating GDA and its stochastic variant (without mini batch) respectively require $O(\kappa^{2} \epsilon^{-2})$ and $O(\kappa^{4} \epsilon^{-4})$ iterations, while (ii) smoothed GDA and its stochastic variant (without mini batch) respectively require $O(\kappa \epsilon^{-2})$ and $O(\kappa^{2} \epsilon^{-4})$ iterations. The latter greatly improves over the vanilla GDA and gives the hitherto best known complexity results among single-loop algorithms under similar settings. We further showcase the empirical efficiency of these algorithms in training GANs and robust nonlinear regression.}
}
Endnote
%0 Conference Paper
%T Faster Single-loop Algorithms for Minimax Optimization without Strong Concavity
%A Junchi Yang
%A Antonio Orvieto
%A Aurelien Lucchi
%A Niao He
%B Proceedings of The 25th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2022
%E Gustau Camps-Valls
%E Francisco J. R. Ruiz
%E Isabel Valera
%F pmlr-v151-yang22b
%I PMLR
%P 5485--5517
%U https://proceedings.mlr.press/v151/yang22b.html
%V 151
%X Gradient descent ascent (GDA), the simplest single-loop algorithm for nonconvex minimax optimization, is widely used in practical applications such as generative adversarial networks (GANs) and adversarial training. Albeit its desirable simplicity, recent work shows inferior convergence rates of GDA in theory, even when assuming strong concavity of the objective in terms of one variable. This paper establishes new convergence results for two alternative single-loop algorithms – alternating GDA and smoothed GDA – under the mild assumption that the objective satisfies the Polyak-Lojasiewicz (PL) condition about one variable. We prove that, to find an $\epsilon$-stationary point, (i) alternating GDA and its stochastic variant (without mini batch) respectively require $O(\kappa^{2} \epsilon^{-2})$ and $O(\kappa^{4} \epsilon^{-4})$ iterations, while (ii) smoothed GDA and its stochastic variant (without mini batch) respectively require $O(\kappa \epsilon^{-2})$ and $O(\kappa^{2} \epsilon^{-4})$ iterations. The latter greatly improves over the vanilla GDA and gives the hitherto best known complexity results among single-loop algorithms under similar settings. We further showcase the empirical efficiency of these algorithms in training GANs and robust nonlinear regression.
APA
Yang, J., Orvieto, A., Lucchi, A. & He, N. (2022). Faster Single-loop Algorithms for Minimax Optimization without Strong Concavity. Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 151:5485-5517. Available from https://proceedings.mlr.press/v151/yang22b.html.