First-Order Algorithms Converge Faster than $O(1/k)$ on Convex Problems

Ching-Pei Lee; Stephen Wright

First-Order Algorithms Converge Faster than $O(1/k)$ on Convex Problems

Ching-Pei Lee, Stephen Wright

Proceedings of the 36th International Conference on Machine Learning, PMLR 97:3754-3762, 2019.

Abstract

It is well known that both gradient descent and stochastic coordinate descent achieve a global convergence rate of $O(1/k)$ in the objective value, when applied to a scheme for minimizing a Lipschitz-continuously differentiable, unconstrained convex function. In this work, we improve this rate to $o(1/k)$. We extend the result to proximal gradient and proximal coordinate descent on regularized problems to show similar $o(1/k)$ convergence rates. The result is tight in the sense that a rate of $O(1/k^{1+\epsilon})$ is not generally attainable for any $\epsilon>0$, for any of these methods.

Cite this Paper

BibTeX

@InProceedings{pmlr-v97-lee19e,
  title = 	 {First-Order Algorithms Converge Faster than $O(1/k)$ on Convex Problems},
  author =       {Lee, Ching-Pei and Wright, Stephen},
  booktitle = 	 {Proceedings of the 36th International Conference on Machine Learning},
  pages = 	 {3754--3762},
  year = 	 {2019},
  editor = 	 {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume = 	 {97},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {09--15 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v97/lee19e/lee19e.pdf},
  url = 	 {https://proceedings.mlr.press/v97/lee19e.html},
  abstract = 	 {It is well known that both gradient descent and stochastic coordinate descent achieve a global convergence rate of $O(1/k)$ in the objective value, when applied to a scheme for minimizing a Lipschitz-continuously differentiable, unconstrained convex function. In this work, we improve this rate to $o(1/k)$. We extend the result to proximal gradient and proximal coordinate descent on regularized problems to show similar $o(1/k)$ convergence rates. The result is tight in the sense that a rate of $O(1/k^{1+\epsilon})$ is not generally attainable for any $\epsilon>0$, for any of these methods.}
}

Endnote

%0 Conference Paper
%T First-Order Algorithms Converge Faster than $O(1/k)$ on Convex Problems
%A Ching-Pei Lee
%A Stephen Wright
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov	
%F pmlr-v97-lee19e
%I PMLR
%P 3754--3762
%U https://proceedings.mlr.press/v97/lee19e.html
%V 97
%X It is well known that both gradient descent and stochastic coordinate descent achieve a global convergence rate of $O(1/k)$ in the objective value, when applied to a scheme for minimizing a Lipschitz-continuously differentiable, unconstrained convex function. In this work, we improve this rate to $o(1/k)$. We extend the result to proximal gradient and proximal coordinate descent on regularized problems to show similar $o(1/k)$ convergence rates. The result is tight in the sense that a rate of $O(1/k^{1+\epsilon})$ is not generally attainable for any $\epsilon>0$, for any of these methods.

APA

Lee, C. & Wright, S.. (2019). First-Order Algorithms Converge Faster than $O(1/k)$ on Convex Problems. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:3754-3762 Available from https://proceedings.mlr.press/v97/lee19e.html.

First-Order Algorithms Converge Faster than $O(1/k)$ on Convex Problems

Abstract

Cite this Paper

Related Material