Escaping From Saddle Points — Online Stochastic Gradient for Tensor Decomposition

Rong Ge, Furong Huang, Chi Jin, Yang Yuan
Proceedings of The 28th Conference on Learning Theory, PMLR 40:797-842, 2015.

Abstract

We analyze stochastic gradient descent for optimizing non-convex functions. In many cases the goal for a non-convex function is to find a reasonable local minimum, and the main concern is that gradient updates get trapped in saddle points. In this paper we identify the strict saddle property for non-convex problems, which allows for efficient optimization. Using this property we show that, from an arbitrary starting point, stochastic gradient descent converges to a local minimum in a polynomial number of iterations. To the best of our knowledge this is the first work that gives global convergence guarantees for stochastic gradient descent on non-convex functions with exponentially many local minima and saddle points. Our analysis can be applied to orthogonal tensor decomposition, which is widely used in learning a rich class of latent variable models. We propose a new optimization formulation for the tensor decomposition problem that has the strict saddle property. As a result we obtain the first online algorithm for orthogonal tensor decomposition with a global convergence guarantee.
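To make the strict saddle picture concrete, the following is a minimal sketch (not the authors' algorithm or code) of noisy gradient descent on a toy two-dimensional function f(x, y) = (x^2 - 1)^2 + y^2, which has a strict saddle at the origin and local minima at (1, 0) and (-1, 0). The step size, noise level, and iteration count are illustrative choices.

    # Minimal sketch, not the paper's implementation: noisy gradient descent
    # on f(x, y) = (x^2 - 1)^2 + y^2. The origin is a strict saddle (the
    # Hessian has a negative eigenvalue along x); (1, 0) and (-1, 0) are the
    # local minima. The injected noise is what pushes the iterate off the saddle.
    import numpy as np

    def grad(p):
        x, y = p
        return np.array([4.0 * x * (x**2 - 1.0), 2.0 * y])  # gradient of f

    rng = np.random.default_rng(0)
    p = np.zeros(2)            # start exactly at the saddle point
    eta, sigma = 0.01, 0.1     # step size and noise scale (illustrative values)

    for _ in range(5000):
        p = p - eta * (grad(p) + sigma * rng.standard_normal(2))  # noisy step

    print(p)   # ends near (1, 0) or (-1, 0), a local minimum, not the saddle

Started exactly at the saddle, noiseless gradient descent would stay there, since the gradient is zero; the injected noise gives the iterate a component along the negative-curvature direction, after which descent drives it to one of the local minima. This is the mechanism the paper analyzes quantitatively for stochastic gradients.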

Cite this Paper


BibTeX
@InProceedings{pmlr-v40-Ge15,
  title     = {Escaping From Saddle Points --- Online Stochastic Gradient for Tensor Decomposition},
  author    = {Ge, Rong and Huang, Furong and Jin, Chi and Yuan, Yang},
  booktitle = {Proceedings of The 28th Conference on Learning Theory},
  pages     = {797--842},
  year      = {2015},
  editor    = {Grünwald, Peter and Hazan, Elad and Kale, Satyen},
  volume    = {40},
  series    = {Proceedings of Machine Learning Research},
  address   = {Paris, France},
  month     = {03--06 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v40/Ge15.pdf},
  url       = {https://proceedings.mlr.press/v40/Ge15.html},
  abstract  = {We analyze stochastic gradient descent for optimizing non-convex functions. In many cases for non-convex functions the goal is to find a reasonable local minimum, and the main concern is that gradient updates are trapped in \em saddle points. In this paper we identify \em strict saddle property for non-convex problem that allows for efficient optimization. Using this property we show that from an \em arbitrary starting point, stochastic gradient descent converges to a local minimum in a polynomial number of iterations. To the best of our knowledge this is the first work that gives \em global convergence guarantees for stochastic gradient descent on non-convex functions with exponentially many local minima and saddle points. Our analysis can be applied to orthogonal tensor decomposition, which is widely used in learning a rich class of latent variable models. We propose a new optimization formulation for the tensor decomposition problem that has strict saddle property. As a result we get the first online algorithm for orthogonal tensor decomposition with global convergence guarantee.}
}
Endnote
%0 Conference Paper
%T Escaping From Saddle Points — Online Stochastic Gradient for Tensor Decomposition
%A Rong Ge
%A Furong Huang
%A Chi Jin
%A Yang Yuan
%B Proceedings of The 28th Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2015
%E Peter Grünwald
%E Elad Hazan
%E Satyen Kale
%F pmlr-v40-Ge15
%I PMLR
%P 797--842
%U https://proceedings.mlr.press/v40/Ge15.html
%V 40
%X We analyze stochastic gradient descent for optimizing non-convex functions. In many cases for non-convex functions the goal is to find a reasonable local minimum, and the main concern is that gradient updates are trapped in \em saddle points. In this paper we identify \em strict saddle property for non-convex problem that allows for efficient optimization. Using this property we show that from an \em arbitrary starting point, stochastic gradient descent converges to a local minimum in a polynomial number of iterations. To the best of our knowledge this is the first work that gives \em global convergence guarantees for stochastic gradient descent on non-convex functions with exponentially many local minima and saddle points. Our analysis can be applied to orthogonal tensor decomposition, which is widely used in learning a rich class of latent variable models. We propose a new optimization formulation for the tensor decomposition problem that has strict saddle property. As a result we get the first online algorithm for orthogonal tensor decomposition with global convergence guarantee.
RIS
TY - CPAPER
TI - Escaping From Saddle Points — Online Stochastic Gradient for Tensor Decomposition
AU - Rong Ge
AU - Furong Huang
AU - Chi Jin
AU - Yang Yuan
BT - Proceedings of The 28th Conference on Learning Theory
DA - 2015/06/26
ED - Peter Grünwald
ED - Elad Hazan
ED - Satyen Kale
ID - pmlr-v40-Ge15
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 40
SP - 797
EP - 842
L1 - http://proceedings.mlr.press/v40/Ge15.pdf
UR - https://proceedings.mlr.press/v40/Ge15.html
AB - We analyze stochastic gradient descent for optimizing non-convex functions. In many cases for non-convex functions the goal is to find a reasonable local minimum, and the main concern is that gradient updates are trapped in \em saddle points. In this paper we identify \em strict saddle property for non-convex problem that allows for efficient optimization. Using this property we show that from an \em arbitrary starting point, stochastic gradient descent converges to a local minimum in a polynomial number of iterations. To the best of our knowledge this is the first work that gives \em global convergence guarantees for stochastic gradient descent on non-convex functions with exponentially many local minima and saddle points. Our analysis can be applied to orthogonal tensor decomposition, which is widely used in learning a rich class of latent variable models. We propose a new optimization formulation for the tensor decomposition problem that has strict saddle property. As a result we get the first online algorithm for orthogonal tensor decomposition with global convergence guarantee.
ER -
APA
Ge, R., Huang, F., Jin, C. & Yuan, Y. (2015). Escaping From Saddle Points — Online Stochastic Gradient for Tensor Decomposition. Proceedings of The 28th Conference on Learning Theory, in Proceedings of Machine Learning Research 40:797-842. Available from https://proceedings.mlr.press/v40/Ge15.html.
