Unifying Nesterov’s Accelerated Gradient Methods for Convex and Strongly Convex Objective Functions

Jungbin Kim, Insoon Yang
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:16897-16954, 2023.

Abstract

Although Nesterov’s accelerated gradient method (AGM) has been studied from various perspectives, it remains unclear why the most popular forms of AGMs must handle convex and strongly convex objective functions separately. To address this inconsistency, we propose a novel unified framework for Lagrangians, ordinary differential equation (ODE) models, and algorithms. As a special case, our new simple momentum algorithm, which we call the unified AGM, seamlessly bridges the gap between the two most popular forms of Nesterov’s AGM and has a superior convergence guarantee compared to existing algorithms for non-strongly convex objective functions. This property is beneficial in practice when considering ill-conditioned $\mu$-strongly convex objective functions (with small $\mu$). Furthermore, we generalize this algorithm and the corresponding ODE model to the higher-order non-Euclidean setting. Last but not least, our unified framework is used to construct the unified AGM-G ODE, a novel ODE model for minimizing the gradient norm of strongly convex functions.
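For context, the two classical forms of Nesterov's AGM that this work unifies are the time-varying-momentum scheme for $L$-smooth convex functions (with momentum coefficient $(k-1)/(k+2)$ and an $O(1/k^2)$ rate) and the constant-momentum scheme for $\mu$-strongly convex functions (with coefficient $(\sqrt{\kappa}-1)/(\sqrt{\kappa}+1)$, $\kappa = L/\mu$, and a linear rate). The sketch below is not from the paper; the function names and the toy ill-conditioned quadratic are illustrative. It shows the two standard updates side by side; the unified AGM proposed in the paper interpolates between these two regimes rather than requiring a choice up front.

import numpy as np

def agm_convex(grad, L, x0, num_iters):
    # Standard Nesterov AGM for L-smooth convex f: O(1/k^2) suboptimality.
    x_prev = x0.copy()
    y = x0.copy()
    for k in range(1, num_iters + 1):
        x = y - grad(y) / L                           # gradient step at the extrapolated point
        y = x + (k - 1.0) / (k + 2.0) * (x - x_prev)  # time-varying momentum (k-1)/(k+2)
        x_prev = x
    return x

def agm_strongly_convex(grad, L, mu, x0, num_iters):
    # Standard Nesterov AGM for L-smooth, mu-strongly convex f: linear rate.
    kappa = L / mu
    beta = (np.sqrt(kappa) - 1.0) / (np.sqrt(kappa) + 1.0)  # constant momentum
    x_prev = x0.copy()
    y = x0.copy()
    for _ in range(num_iters):
        x = y - grad(y) / L
        y = x + beta * (x - x_prev)
        x_prev = x
    return x

# Toy ill-conditioned quadratic f(x) = 0.5 * sum(d_i * x_i^2), eigenvalues in [mu, L].
rng = np.random.default_rng(0)
L_, mu_ = 100.0, 1e-3
diag = np.concatenate(([mu_, L_], rng.uniform(mu_, L_, 8)))
grad_f = lambda x: diag * x  # minimizer is the origin
x0 = np.ones(10)
print(np.linalg.norm(agm_convex(grad_f, L_, x0, 500)))
print(np.linalg.norm(agm_strongly_convex(grad_f, L_, mu_, x0, 500)))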

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-kim23y,
  title     = {Unifying {N}esterov's Accelerated Gradient Methods for Convex and Strongly Convex Objective Functions},
  author    = {Kim, Jungbin and Yang, Insoon},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {16897--16954},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/kim23y/kim23y.pdf},
  url       = {https://proceedings.mlr.press/v202/kim23y.html}
}
Endnote
%0 Conference Paper
%T Unifying Nesterov's Accelerated Gradient Methods for Convex and Strongly Convex Objective Functions
%A Jungbin Kim
%A Insoon Yang
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-kim23y
%I PMLR
%P 16897--16954
%U https://proceedings.mlr.press/v202/kim23y.html
%V 202
APA
Kim, J. & Yang, I. (2023). Unifying Nesterov's Accelerated Gradient Methods for Convex and Strongly Convex Objective Functions. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:16897-16954. Available from https://proceedings.mlr.press/v202/kim23y.html.
