A Linearly-Convergent Stochastic L-BFGS Algorithm

Philipp Moritz, Robert Nishihara, Michael Jordan
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, PMLR 51:249-258, 2016.

Abstract

We propose a new stochastic L-BFGS algorithm and prove a linear convergence rate for strongly convex and smooth functions. Our algorithm draws heavily from a recent stochastic variant of L-BFGS proposed in Byrd et al. (2014) as well as a recent approach to variance reduction for stochastic gradient descent from Johnson and Zhang (2013). We demonstrate experimentally that our algorithm performs well on large-scale convex and non-convex optimization problems, exhibiting linear convergence and rapidly solving the optimization problems to high levels of precision. Furthermore, we show that our algorithm performs well for a wide range of step sizes, often differing by several orders of magnitude.
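The abstract only sketches the method at a high level: an SVRG-style control variate (Johnson and Zhang, 2013) supplies low-variance stochastic gradients, and an L-BFGS-style two-loop recursion (Byrd et al., 2014) turns them into search directions. The NumPy sketch below illustrates that combination under simplifying assumptions; it is not the authors' reference implementation, and names such as svrg_lbfgs, memory, and update_every are illustrative. In particular, it forms curvature pairs from gradient differences of averaged iterates rather than the subsampled Hessian-vector products used in the paper.

# Minimal sketch of variance-reduced stochastic L-BFGS (illustrative, not the
# authors' code): SVRG control variates combined with L-BFGS curvature pairs.
import numpy as np

def two_loop(grad, s_list, y_list):
    """Classical L-BFGS two-loop recursion: approximates H * grad."""
    q = grad.copy()
    alphas = []
    for s, y in zip(reversed(s_list), reversed(y_list)):
        rho = 1.0 / y.dot(s)
        a = rho * s.dot(q)
        alphas.append(a)
        q -= a * y
    if s_list:
        s, y = s_list[-1], y_list[-1]
        q *= s.dot(y) / y.dot(y)      # initial Hessian scaling
    for (s, y), a in zip(zip(s_list, y_list), reversed(alphas)):
        rho = 1.0 / y.dot(s)
        b = rho * y.dot(q)
        q += (a - b) * s
    return q

def svrg_lbfgs(grad_i, full_grad, w0, n, eta=0.01, epochs=10, inner=None,
               memory=10, update_every=10, rng=None):
    """grad_i(w, i): gradient of the i-th component; full_grad(w): full gradient."""
    rng = rng or np.random.default_rng(0)
    inner = inner or n
    w = w0.copy()
    s_list, y_list = [], []
    prev_avg = None
    for _ in range(epochs):
        w_snap = w.copy()
        mu = full_grad(w_snap)            # full gradient at the snapshot point
        iterates = []
        for t in range(inner):
            i = rng.integers(n)
            # Variance-reduced stochastic gradient (SVRG control variate).
            g = grad_i(w, i) - grad_i(w_snap, i) + mu
            d = two_loop(g, s_list, y_list) if s_list else g
            w = w - eta * d
            iterates.append(w.copy())
            # Periodically form a curvature pair from averaged iterates; the
            # gradient difference here is a stand-in for the paper's
            # subsampled Hessian-vector product.
            if (t + 1) % update_every == 0:
                avg = np.mean(iterates[-update_every:], axis=0)
                if prev_avg is not None:
                    s = avg - prev_avg
                    y = full_grad(avg) - full_grad(prev_avg)
                    if y.dot(s) > 1e-10:   # keep only curvature pairs with positive s'y
                        s_list.append(s)
                        y_list.append(y)
                        if len(s_list) > memory:
                            s_list.pop(0)
                            y_list.pop(0)
                prev_avg = avg
    return w

For a smooth, strongly convex objective such as an L2-regularized logistic loss, the step size eta can be held fixed; the paper's linear-rate guarantee is what motivates using a constant step size rather than a decaying schedule.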

Cite this Paper


BibTeX
@InProceedings{pmlr-v51-moritz16,
  title     = {A Linearly-Convergent Stochastic L-BFGS Algorithm},
  author    = {Moritz, Philipp and Nishihara, Robert and Jordan, Michael},
  booktitle = {Proceedings of the 19th International Conference on Artificial Intelligence and Statistics},
  pages     = {249--258},
  year      = {2016},
  editor    = {Gretton, Arthur and Robert, Christian C.},
  volume    = {51},
  series    = {Proceedings of Machine Learning Research},
  address   = {Cadiz, Spain},
  month     = {09--11 May},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v51/moritz16.pdf},
  url       = {https://proceedings.mlr.press/v51/moritz16.html},
  abstract  = {We propose a new stochastic L-BFGS algorithm and prove a linear convergence rate for strongly convex and smooth functions. Our algorithm draws heavily from a recent stochastic variant of L-BFGS proposed in Byrd et al. (2014) as well as a recent approach to variance reduction for stochastic gradient descent from Johnson and Zhang (2013). We demonstrate experimentally that our algorithm performs well on large-scale convex and non-convex optimization problems, exhibiting linear convergence and rapidly solving the optimization problems to high levels of precision. Furthermore, we show that our algorithm performs well for a wide range of step sizes, often differing by several orders of magnitude.}
}
APA
Moritz, P., Nishihara, R. & Jordan, M. (2016). A Linearly-Convergent Stochastic L-BFGS Algorithm. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 51:249-258. Available from https://proceedings.mlr.press/v51/moritz16.html.
