A Generic Approach for Escaping Saddle points

Sashank Reddi, Manzil Zaheer, Suvrit Sra, Barnabas Poczos, Francis Bach, Ruslan Salakhutdinov, Alex Smola
Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, PMLR 84:1233-1242, 2018.

Abstract

A central challenge to using first-order methods for optimizing nonconvex problems is the presence of saddle points. First-order methods often get stuck at saddle points, greatly deteriorating their performance. Typically, to escape from saddles one has to use second-order methods. However, most works on second-order methods rely extensively on expensive Hessian-based computations, making them impractical in large-scale settings. To tackle this challenge, we introduce a generic framework that minimizes Hessian-based computations while at the same time provably converging to second-order critical points. Our framework carefully alternates between a first-order and a second-order subroutine, using the latter only close to saddle points, and yields convergence results competitive to the state-of-the-art. Empirical results suggest that our strategy also enjoys a good practical performance.

Cite this Paper


BibTeX
@InProceedings{pmlr-v84-reddi18a, title = {A Generic Approach for Escaping Saddle points}, author = {Reddi, Sashank and Zaheer, Manzil and Sra, Suvrit and Poczos, Barnabas and Bach, Francis and Salakhutdinov, Ruslan and Smola, Alex}, booktitle = {Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics}, pages = {1233--1242}, year = {2018}, editor = {Storkey, Amos and Perez-Cruz, Fernando}, volume = {84}, series = {Proceedings of Machine Learning Research}, month = {09--11 Apr}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v84/reddi18a/reddi18a.pdf}, url = {https://proceedings.mlr.press/v84/reddi18a.html}, abstract = {A central challenge to using first-order methods for optimizing nonconvex problems is the presence of saddle points. First-order methods often get stuck at saddle points, greatly deteriorating their performance. Typically, to escape from saddles one has to use second-order methods. However, most works on second-order methods rely extensively on expensive Hessian-based computations, making them impractical in large-scale settings. To tackle this challenge, we introduce a generic framework that minimizes Hessian-based computations while at the same time provably converging to second-order critical points. Our framework carefully alternates between a first-order and a second-order subroutine, using the latter only close to saddle points, and yields convergence results competitive to the state-of-the-art. Empirical results suggest that our strategy also enjoys a good practical performance.} }
Endnote
%0 Conference Paper %T A Generic Approach for Escaping Saddle points %A Sashank Reddi %A Manzil Zaheer %A Suvrit Sra %A Barnabas Poczos %A Francis Bach %A Ruslan Salakhutdinov %A Alex Smola %B Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2018 %E Amos Storkey %E Fernando Perez-Cruz %F pmlr-v84-reddi18a %I PMLR %P 1233--1242 %U https://proceedings.mlr.press/v84/reddi18a.html %V 84 %X A central challenge to using first-order methods for optimizing nonconvex problems is the presence of saddle points. First-order methods often get stuck at saddle points, greatly deteriorating their performance. Typically, to escape from saddles one has to use second-order methods. However, most works on second-order methods rely extensively on expensive Hessian-based computations, making them impractical in large-scale settings. To tackle this challenge, we introduce a generic framework that minimizes Hessian-based computations while at the same time provably converging to second-order critical points. Our framework carefully alternates between a first-order and a second-order subroutine, using the latter only close to saddle points, and yields convergence results competitive to the state-of-the-art. Empirical results suggest that our strategy also enjoys a good practical performance.
APA
Reddi, S., Zaheer, M., Sra, S., Poczos, B., Bach, F., Salakhutdinov, R. & Smola, A.. (2018). A Generic Approach for Escaping Saddle points. Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 84:1233-1242 Available from https://proceedings.mlr.press/v84/reddi18a.html.

Related Material