Natasha: Faster Non-Convex Stochastic Optimization via Strongly Non-Convex Parameter
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:89-97, 2017.
Abstract
Given a non-convex function f(x) that is an average of n smooth functions, we design stochastic first-order methods to find its approximate stationary points. The performance of our new methods depends on the smallest (negative) eigenvalue −σ of the Hessian. This parameter σ captures how strongly non-convex f(x) is, and is analogous to the strong-convexity parameter in convex optimization. At least in theory, our methods outperform known results for a range of values of σ, and can also be used to find approximate local minima. Our result implies an interesting dichotomy: there exists a threshold σ₀ such that the (currently) fastest methods for σ > σ₀ and for σ < σ₀ behave differently: the former scales with n^{2/3} and the latter with n^{3/4}.
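
To make the problem setup concrete, the following is a minimal sketch of finding an ε-approximate stationary point of a finite-sum objective f(x) = (1/n) Σᵢ fᵢ(x) with plain stochastic gradient descent. It is a hedged illustration of the setting only, not the paper's Natasha method: the toy non-convex loss φ(t) = t²/(1+t²), the function names (component_grad, full_grad, sgd_stationary_point), and all parameter values are illustrative assumptions.

```python
import numpy as np

# Toy finite-sum objective: f(x) = (1/n) * sum_i f_i(x), where
# f_i(x) = phi(a_i^T x - b_i) and phi(t) = t^2 / (1 + t^2) is smooth
# but non-convex. All names here are illustrative, not from the paper.

def component_grad(A, b, x, i):
    """Gradient of the single component f_i at x."""
    t = A[i] @ x - b[i]
    return (2.0 * t / (1.0 + t * t) ** 2) * A[i]

def full_grad(A, b, x):
    """Gradient of the average f(x) = (1/n) sum_i f_i(x)."""
    t = A @ x - b
    return ((2.0 * t / (1.0 + t * t) ** 2) @ A) / len(b)

def sgd_stationary_point(A, b, eta=0.05, eps=1e-3, max_iters=100_000, seed=0):
    """Plain SGD; stop once ||grad f(x)|| <= eps (approximate stationarity).

    The full-gradient check below costs O(n) and is done only occasionally,
    since evaluating it every step would defeat the point of SGD.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for it in range(max_iters):
        i = rng.integers(n)                       # sample one component f_i
        x -= eta * component_grad(A, b, x, i)     # stochastic gradient step
        if it % n == 0 and np.linalg.norm(full_grad(A, b, x)) <= eps:
            break
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((200, 10))
    b = rng.standard_normal(200)
    x = sgd_stationary_point(A, b)
    print("||grad f(x)|| =", np.linalg.norm(full_grad(A, b, x)))
```

This baseline only targets first-order stationarity; the methods discussed in the paper additionally exploit the strong non-convexity parameter σ (the magnitude of the most negative Hessian eigenvalue) to obtain faster rates and approximate local minima.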