A Generic Approach for Escaping Saddle points

Sashank Reddi; Manzil Zaheer; Suvrit Sra; Barnabas Poczos; Francis Bach; Ruslan Salakhutdinov; Alex Smola

Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, PMLR 84:1233-1242, 2018.

Abstract

A central challenge to using first-order methods for optimizing nonconvex problems is the presence of saddle points. First-order methods often get stuck at saddle points, greatly deteriorating their performance. Typically, to escape from saddles one has to use second-order methods. However, most works on second-order methods rely extensively on expensive Hessian-based computations, making them impractical in large-scale settings. To tackle this challenge, we introduce a generic framework that minimizes Hessian-based computations while at the same time provably converging to second-order critical points. Our framework carefully alternates between a first-order and a second-order subroutine, using the latter only close to saddle points, and yields convergence results competitive to the state-of-the-art. Empirical results suggest that our strategy also enjoys a good practical performance.
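The alternation described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm or code: the toy objective, the step sizes `lr` and `alpha`, the thresholds `eps` and `gamma`, and all function names are assumptions chosen for illustration. Plain gradient descent stands in for the first-order subroutine, and a step along the most negative Hessian eigenvector stands in for the second-order subroutine, which is called only when the gradient is nearly zero.

```python
import math

# Toy objective (an assumption for illustration): f(x, y) = x^2/2 + y^4/4 - y^2/2.
# It has a saddle at (0, 0) and minima at (0, 1) and (0, -1).
def f(p):
    x, y = p
    return 0.5 * x * x + 0.25 * y ** 4 - 0.5 * y * y

def grad(p):
    x, y = p
    return (x, y ** 3 - y)

def hess(p):
    _, y = p
    return ((1.0, 0.0), (0.0, 3.0 * y * y - 1.0))

def min_eig_2x2(h):
    """Smallest eigenvalue and a unit eigenvector of a symmetric 2x2 matrix."""
    a, b, c = h[0][0], h[0][1], h[1][1]
    lam = 0.5 * (a + c) - math.sqrt((0.5 * (a - c)) ** 2 + b * b)
    if b != 0.0:
        v = (b, lam - a)
    else:
        v = (1.0, 0.0) if a <= c else (0.0, 1.0)
    norm = math.hypot(v[0], v[1])
    return lam, (v[0] / norm, v[1] / norm)

def escape_saddles(p, lr=0.1, eps=1e-6, gamma=1e-3, alpha=0.5, max_iter=10000):
    """Alternate gradient steps with occasional negative-curvature steps."""
    hessian_calls = 0
    for _ in range(max_iter):
        g = grad(p)
        if math.hypot(g[0], g[1]) > eps:
            # First-order subroutine: an ordinary gradient step.
            p = (p[0] - lr * g[0], p[1] - lr * g[1])
            continue
        # Gradient is tiny: only now pay for Hessian information.
        hessian_calls += 1
        lam, v = min_eig_2x2(hess(p))
        if lam >= -gamma:
            break  # approximate second-order critical point
        # Second-order subroutine: move along the negative-curvature
        # direction, choosing the sign that gives the lower objective.
        candidates = [(p[0] + s * alpha * v[0], p[1] + s * alpha * v[1])
                      for s in (1.0, -1.0)]
        p = min(candidates, key=f)
    return p, hessian_calls

point, calls = escape_saddles((1.0, 0.0))
```

Starting from `(1.0, 0.0)`, gradient descent alone would stall at the saddle at the origin, because the gradient in `y` is exactly zero along that axis. The Hessian check detects the negative curvature there and moves the iterate off the saddle, after which gradient steps carry it to a minimum.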

Cite this Paper

BibTeX

@InProceedings{pmlr-v84-reddi18a,
  title = 	 {A Generic Approach for Escaping Saddle points},
  author = 	 {Reddi, Sashank and Zaheer, Manzil and Sra, Suvrit and Poczos, Barnabas and Bach, Francis and Salakhutdinov, Ruslan and Smola, Alex},
  booktitle = 	 {Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics},
  pages = 	 {1233--1242},
  year = 	 {2018},
  editor = 	 {Storkey, Amos and Perez-Cruz, Fernando},
  volume = 	 {84},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {09--11 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v84/reddi18a/reddi18a.pdf},
  url = 	 {https://proceedings.mlr.press/v84/reddi18a.html},
  abstract = 	 {A central challenge to using first-order methods for optimizing nonconvex problems is the presence of saddle points. First-order methods often get stuck at saddle points, greatly deteriorating their performance. Typically, to escape from saddles one has to use second-order methods. However, most works on second-order methods rely extensively on expensive Hessian-based computations, making them impractical in large-scale settings. To tackle this challenge, we introduce a generic framework that minimizes Hessian-based computations while at the same time provably converging to second-order critical points. Our framework carefully alternates between a first-order and a second-order subroutine, using the latter only close to saddle points, and yields convergence results competitive to the state-of-the-art. Empirical results suggest that our strategy also enjoys a good practical performance.}
}

Endnote

%0 Conference Paper
%T A Generic Approach for Escaping Saddle points
%A Sashank Reddi
%A Manzil Zaheer
%A Suvrit Sra
%A Barnabas Poczos
%A Francis Bach
%A Ruslan Salakhutdinov
%A Alex Smola
%B Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2018
%E Amos Storkey
%E Fernando Perez-Cruz
%F pmlr-v84-reddi18a
%I PMLR
%P 1233-1242
%U https://proceedings.mlr.press/v84/reddi18a.html
%V 84
%X A central challenge to using first-order methods for optimizing nonconvex problems is the presence of saddle points. First-order methods often get stuck at saddle points, greatly deteriorating their performance. Typically, to escape from saddles one has to use second-order methods. However, most works on second-order methods rely extensively on expensive Hessian-based computations, making them impractical in large-scale settings. To tackle this challenge, we introduce a generic framework that minimizes Hessian-based computations while at the same time provably converging to second-order critical points. Our framework carefully alternates between a first-order and a second-order subroutine, using the latter only close to saddle points, and yields convergence results competitive to the state-of-the-art. Empirical results suggest that our strategy also enjoys a good practical performance.

APA

Reddi, S., Zaheer, M., Sra, S., Poczos, B., Bach, F., Salakhutdinov, R. & Smola, A. (2018). A Generic Approach for Escaping Saddle points. Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 84:1233-1242. Available from https://proceedings.mlr.press/v84/reddi18a.html.