Anderson acceleration of coordinate descent

Quentin Bertrand; Mathurin Massias

Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:1288-1296, 2021.

Abstract

Acceleration of first-order methods is mainly obtained via inertia à la Nesterov, or via nonlinear extrapolation. The latter has seen a recent surge of interest, with successful applications to gradient and proximal gradient techniques. On multiple machine learning problems, coordinate descent achieves performance significantly superior to full-gradient methods. Speeding up coordinate descent in practice is not easy: inertially accelerated versions of coordinate descent are theoretically accelerated, but might not always lead to practical speed-ups. We propose an accelerated version of coordinate descent using extrapolation, showing considerable speed-up in practice compared to inertially accelerated coordinate descent and extrapolated (proximal) gradient descent. Experiments on least squares, Lasso, elastic net and logistic regression validate the approach.
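
To make the idea of combining coordinate descent with nonlinear extrapolation concrete, here is a minimal sketch on a least squares problem. It is an illustration under stated assumptions, not the authors' reference implementation: the function names (`cd_epoch`, `objective`, `anderson_cd`), the buffer size `K=5`, and the choice to restart the buffer after each extrapolation are all hypothetical. Every `K` epochs, the last iterates are combined with weights obtained from a small linear system, and the extrapolated point is kept only if it does not increase the objective.

```python
import numpy as np


def cd_epoch(X, y, w, lipschitz):
    """One cyclic pass of coordinate descent on 0.5 * ||y - X w||^2."""
    w = w.copy()
    residual = y - X @ w
    for j in range(X.shape[1]):
        old = w[j]
        # Exact minimization along coordinate j.
        w[j] += X[:, j] @ residual / lipschitz[j]
        residual -= (w[j] - old) * X[:, j]
    return w


def objective(X, y, w):
    return 0.5 * np.sum((y - X @ w) ** 2)


def anderson_cd(X, y, n_epochs=100, K=5):
    """Coordinate descent with an Anderson-style extrapolation every K epochs.

    Illustrative sketch: K and the restart strategy are assumptions.
    """
    lipschitz = np.sum(X ** 2, axis=0)  # squared column norms
    w = np.zeros(X.shape[1])
    history = [w]
    for _ in range(n_epochs):
        w = cd_epoch(X, y, w, lipschitz)
        history.append(w)
        if len(history) == K + 1:
            iterates = np.array(history)   # shape (K + 1, p)
            U = np.diff(iterates, axis=0)  # successive differences, shape (K, p)
            try:
                z = np.linalg.solve(U @ U.T, np.ones(K))
                c = z / z.sum()            # weights summing to one
                w_extr = c @ iterates[1:]  # combine the last K iterates
                # Safeguard: accept only if the objective does not increase.
                if objective(X, y, w_extr) <= objective(X, y, w):
                    w = w_extr
            except np.linalg.LinAlgError:
                pass  # singular system (e.g. already converged): skip
            history = [w]
    return w
```

A possible usage is `w = anderson_cd(X, y)` for a design matrix `X` whose columns are all nonzero and a target vector `y`. The objective check makes the extrapolation step safe to attempt: a poorly conditioned linear system can produce a bad candidate, which is then simply discarded.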

Cite this Paper

BibTeX


@InProceedings{pmlr-v130-bertrand21a,
  title = 	 {Anderson acceleration of coordinate descent},
  author =       {Bertrand, Quentin and Massias, Mathurin},
  booktitle = 	 {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {1288--1296},
  year = 	 {2021},
  editor = 	 {Banerjee, Arindam and Fukumizu, Kenji},
  volume = 	 {130},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--15 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v130/bertrand21a/bertrand21a.pdf},
  url = 	 {https://proceedings.mlr.press/v130/bertrand21a.html},
  abstract = 	 { Acceleration of first order methods is mainly obtained via inertia à la Nesterov, or via nonlinear extrapolation. The latter has known a recent surge of interest, with successful applications to gradient and proximal gradient techniques. On multiple Machine Learning problems, coordinate descent achieves performance significantly superior to full-gradient methods. Speeding up coordinate descent in practice is not easy: inertially accelerated versions of coordinate descent are theoretically accelerated, but might not always lead to practical speed-ups. We propose an accelerated version of coordinate descent using extrapolation, showing considerable speed up in practice, compared to inertial accelerated coordinate descent and extrapolated (proximal) gradient descent. Experiments on least squares, Lasso, elastic net and logistic regression validate the approach. }
}

Endnote

%0 Conference Paper
%T Anderson acceleration of coordinate descent
%A Quentin Bertrand
%A Mathurin Massias
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu
%F pmlr-v130-bertrand21a
%I PMLR
%P 1288--1296
%U https://proceedings.mlr.press/v130/bertrand21a.html
%V 130
%X  Acceleration of first order methods is mainly obtained via inertia à la Nesterov, or via nonlinear extrapolation. The latter has known a recent surge of interest, with successful applications to gradient and proximal gradient techniques. On multiple Machine Learning problems, coordinate descent achieves performance significantly superior to full-gradient methods. Speeding up coordinate descent in practice is not easy: inertially accelerated versions of coordinate descent are theoretically accelerated, but might not always lead to practical speed-ups. We propose an accelerated version of coordinate descent using extrapolation, showing considerable speed up in practice, compared to inertial accelerated coordinate descent and extrapolated (proximal) gradient descent. Experiments on least squares, Lasso, elastic net and logistic regression validate the approach.

APA


Bertrand, Q. & Massias, M. (2021). Anderson acceleration of coordinate descent. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:1288-1296. Available from https://proceedings.mlr.press/v130/bertrand21a.html.
