On a Combination of Alternating Minimization and Nesterov’s Momentum

Sergey Guminov, Pavel Dvurechensky, Nazarii Tupitsa, Alexander Gasnikov
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:3886-3898, 2021.

Abstract

Alternating minimization (AM) procedures are practically efficient in many applications for solving convex and non-convex optimization problems. On the other hand, Nesterov’s accelerated gradient method is the theoretically optimal first-order method for convex optimization. In this paper we combine AM and Nesterov’s acceleration to propose an accelerated alternating minimization algorithm. We prove a $1/k^2$ convergence rate in terms of the objective for convex problems and $1/k$ in terms of the squared gradient norm for non-convex problems, where $k$ is the iteration counter. Our method requires no knowledge of either the convexity of the problem or function parameters such as the Lipschitz constant of the gradient, i.e., it is adaptive to convexity and smoothness and is uniformly optimal for smooth convex and non-convex problems. Further, we develop a primal-dual modification for strongly convex problems with linear constraints and prove the same $1/k^2$ rate for both the primal objective residual and the constraint feasibility.
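
As background for the two ingredients being combined, the sketch below is an illustrative assumption rather than code from the paper: it shows plain two-block alternating minimization and Nesterov’s accelerated gradient method on a toy convex least-squares objective $f(x,y)=\tfrac12\|Ax+By-c\|^2$. The paper’s accelerated AM algorithm itself involves a line search and an adaptive block choice and is not reproduced here; the problem sizes and the momentum weight k/(k+3) are arbitrary illustrative choices.

import numpy as np

rng = np.random.default_rng(0)
n, dx, dy = 60, 20, 15
A = rng.standard_normal((n, dx))
B = rng.standard_normal((n, dy))
c = rng.standard_normal(n)

def f(x, y):
    # Toy smooth convex objective with two blocks of variables.
    return 0.5 * np.linalg.norm(A @ x + B @ y - c) ** 2

# Plain alternating minimization: exact minimization over one block at a time.
x, y = np.zeros(dx), np.zeros(dy)
for _ in range(100):
    x = np.linalg.lstsq(A, c - B @ y, rcond=None)[0]  # argmin_x f(x, y)
    y = np.linalg.lstsq(B, c - A @ x, rcond=None)[0]  # argmin_y f(x, y)
print("AM objective:", f(x, y))

# Nesterov's accelerated gradient on the joint variable z = (x, y).
M = np.hstack([A, B])
L = np.linalg.norm(M, 2) ** 2            # Lipschitz constant of the gradient
z = z_prev = np.zeros(dx + dy)
for k in range(100):
    w = z + k / (k + 3) * (z - z_prev)   # extrapolation (momentum) step
    grad = M.T @ (M @ w - c)             # gradient at the extrapolated point
    z_prev = z
    z = w - grad / L                     # gradient step with step size 1/L
print("AGD objective:", f(z[:dx], z[dx:]))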

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-guminov21a,
  title     = {On a Combination of Alternating Minimization and Nesterov’s Momentum},
  author    = {Guminov, Sergey and Dvurechensky, Pavel and Tupitsa, Nazarii and Gasnikov, Alexander},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {3886--3898},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/guminov21a/guminov21a.pdf},
  url       = {https://proceedings.mlr.press/v139/guminov21a.html}
}
Endnote
%0 Conference Paper
%T On a Combination of Alternating Minimization and Nesterov’s Momentum
%A Sergey Guminov
%A Pavel Dvurechensky
%A Nazarii Tupitsa
%A Alexander Gasnikov
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-guminov21a
%I PMLR
%P 3886--3898
%U https://proceedings.mlr.press/v139/guminov21a.html
%V 139
APA
Guminov, S., Dvurechensky, P., Tupitsa, N. & Gasnikov, A. (2021). On a Combination of Alternating Minimization and Nesterov’s Momentum. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:3886-3898. Available from https://proceedings.mlr.press/v139/guminov21a.html.