Last-Iterate Convergence Rates for Min-Max Optimization: Convergence of Hamiltonian Gradient Descent and Consensus Optimization

Jacob Abernethy, Kevin A. Lai, Andre Wibisono
Proceedings of the 32nd International Conference on Algorithmic Learning Theory, PMLR 132:3-47, 2021.

Abstract

While classic work in convex-concave min-max optimization relies on average-iterate convergence results, the emergence of nonconvex applications such as training Generative Adversarial Networks has led to renewed interest in last-iterate convergence guarantees. Proving last-iterate convergence is challenging because many natural algorithms, such as Simultaneous Gradient Descent/Ascent, provably diverge or cycle even in simple convex-concave min-max settings, and there are relatively few papers that prove global last-iterate convergence rates beyond the bilinear and convex-strongly concave settings. In this work, we show that the Hamiltonian Gradient Descent (HGD) algorithm achieves linear convergence in a variety of more general settings, including convex-concave problems that satisfy a "sufficiently bilinear" condition. We also prove convergence rates for stochastic HGD and for some parameter settings of the Consensus Optimization algorithm of Mescheder et al. (2017).
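As a rough illustration of the algorithm named in the abstract: Hamiltonian Gradient Descent runs plain gradient descent on the "Hamiltonian" H(z) = ½‖ξ(z)‖², where ξ(z) = (∇ₓf, −∇ᵧf) is the game's vector field. The sketch below is not the paper's implementation; it hand-codes the gradients for the simple bilinear game f(x, y) = x·y, on which Simultaneous Gradient Descent/Ascent is known to cycle, while HGD contracts to the saddle point (0, 0).

```python
import numpy as np

# Hedged sketch of Hamiltonian Gradient Descent (HGD) on the bilinear
# game f(x, y) = x * y, whose unique saddle point is (0, 0).
# Vector field: xi(z) = (df/dx, -df/dy) = (y, -x),
# so H(x, y) = 0.5 * (x**2 + y**2) and grad H = (x, y).
# HGD is ordinary gradient descent on H.

def hgd(z0, eta=0.1, steps=200):
    z = np.array(z0, dtype=float)
    for _ in range(steps):
        x, y = z
        grad_h = np.array([x, y])  # analytic grad of H for f(x, y) = x * y
        z = z - eta * grad_h
    return z

z_final = hgd([1.0, -2.0])
print(np.linalg.norm(z_final))  # distance to the saddle point (0, 0)
```

In general H involves second derivatives of f (its gradient is a Jacobian-transpose-vector product), so practical implementations use automatic differentiation rather than hand-derived gradients as above.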

Cite this Paper
BibTeX
@InProceedings{pmlr-v132-abernethy21a,
  title     = {Last-Iterate Convergence Rates for Min-Max Optimization: Convergence of Hamiltonian Gradient Descent and Consensus Optimization},
  author    = {Abernethy, Jacob and Lai, Kevin A. and Wibisono, Andre},
  booktitle = {Proceedings of the 32nd International Conference on Algorithmic Learning Theory},
  pages     = {3--47},
  year      = {2021},
  editor    = {Feldman, Vitaly and Ligett, Katrina and Sabato, Sivan},
  volume    = {132},
  series    = {Proceedings of Machine Learning Research},
  month     = {16--19 Mar},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v132/abernethy21a/abernethy21a.pdf},
  url       = {https://proceedings.mlr.press/v132/abernethy21a.html},
  abstract  = {While classic work in convex-concave min-max optimization relies on average-iterate convergence results, the emergence of nonconvex applications such as training Generative Adversarial Networks has led to renewed interest in last-iterate convergence guarantees. Proving last-iterate convergence is challenging because many natural algorithms, such as Simultaneous Gradient Descent/Ascent, provably diverge or cycle even in simple convex-concave min-max settings, and there are relatively few papers that prove global last-iterate convergence rates beyond the bilinear and convex-strongly concave settings. In this work, we show that the Hamiltonian Gradient Descent (HGD) algorithm achieves linear convergence in a variety of more general settings, including convex-concave problems that satisfy a "sufficiently bilinear" condition. We also prove convergence rates for stochastic HGD and for some parameter settings of the Consensus Optimization algorithm of Mescheder et al. (2017).}
}
Endnote
%0 Conference Paper
%T Last-Iterate Convergence Rates for Min-Max Optimization: Convergence of Hamiltonian Gradient Descent and Consensus Optimization
%A Jacob Abernethy
%A Kevin A. Lai
%A Andre Wibisono
%B Proceedings of the 32nd International Conference on Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2021
%E Vitaly Feldman
%E Katrina Ligett
%E Sivan Sabato
%F pmlr-v132-abernethy21a
%I PMLR
%P 3--47
%U https://proceedings.mlr.press/v132/abernethy21a.html
%V 132
%X While classic work in convex-concave min-max optimization relies on average-iterate convergence results, the emergence of nonconvex applications such as training Generative Adversarial Networks has led to renewed interest in last-iterate convergence guarantees. Proving last-iterate convergence is challenging because many natural algorithms, such as Simultaneous Gradient Descent/Ascent, provably diverge or cycle even in simple convex-concave min-max settings, and there are relatively few papers that prove global last-iterate convergence rates beyond the bilinear and convex-strongly concave settings. In this work, we show that the Hamiltonian Gradient Descent (HGD) algorithm achieves linear convergence in a variety of more general settings, including convex-concave problems that satisfy a "sufficiently bilinear" condition. We also prove convergence rates for stochastic HGD and for some parameter settings of the Consensus Optimization algorithm of Mescheder et al. (2017).
APA
Abernethy, J., Lai, K.A. & Wibisono, A. (2021). Last-Iterate Convergence Rates for Min-Max Optimization: Convergence of Hamiltonian Gradient Descent and Consensus Optimization. Proceedings of the 32nd International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 132:3-47. Available from https://proceedings.mlr.press/v132/abernethy21a.html.

Related Material