Stabilized SVRG: Simple Variance Reduction for Nonconvex Optimization

Rong Ge; Zhize Li; Weiyao Wang; Xiang Wang

Proceedings of the Thirty-Second Conference on Learning Theory, PMLR 99:1394-1448, 2019.

Abstract

Variance reduction techniques like SVRG provide simple and fast algorithms for optimizing a convex finite-sum objective. For nonconvex objectives, these techniques can also find a first-order stationary point (with small gradient). However, in nonconvex optimization it is often crucial to find a second-order stationary point (with small gradient and almost PSD hessian). In this paper, we show that Stabilized SVRG (a simple variant of SVRG) can find an $\epsilon$-second-order stationary point using only $\widetilde{O}(n^{2/3}/\epsilon^2+n/\epsilon^{1.5})$ stochastic gradients. To our best knowledge, this is the first second-order guarantee for a simple variant of SVRG. The running time almost matches the known guarantees for finding $\epsilon$-first-order stationary points.
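
For readers unfamiliar with the baseline, the following is a minimal sketch of plain SVRG on a finite-sum objective, written in Python with NumPy. It is not the paper's Stabilized SVRG, which is a variant of this scheme and is not reproduced here. The function name `svrg`, its parameters, and the toy least-squares data are all assumptions chosen for illustration.

```python
import numpy as np

def svrg(grad_i, x0, n, step, epochs, inner_steps, seed=0):
    """Plain SVRG for f(x) = (1/n) * sum_i f_i(x).

    grad_i(x, i) returns the gradient of the i-th component f_i at x.
    All names and defaults here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(epochs):
        snapshot = x.copy()
        # Full gradient at the snapshot point: costs n component gradients.
        full_grad = sum(grad_i(snapshot, i) for i in range(n)) / n
        for _ in range(inner_steps):
            i = rng.integers(n)
            # Variance-reduced estimate: unbiased for the full gradient at x,
            # with variance that shrinks as x approaches the snapshot.
            v = grad_i(x, i) - grad_i(snapshot, i) + full_grad
            x = x - step * v
    return x

# Toy convex finite sum (hypothetical data): f_i(x) = 0.5 * (a_i @ x - b_i)**2
data_rng = np.random.default_rng(1)
A = data_rng.normal(size=(50, 5))
x_true = data_rng.normal(size=5)
b = A @ x_true

def grad_i(x, i):
    # Gradient of the i-th least-squares term.
    return (A[i] @ x - b[i]) * A[i]

x_hat = svrg(grad_i, np.zeros(5), n=50, step=0.01, epochs=50, inner_steps=100)
# Norm of the full gradient at the returned point (first-order stationarity check).
grad_norm = np.linalg.norm(A.T @ (A @ x_hat - b) / 50)
```

Each outer epoch spends `n` component gradients on the snapshot and two more per inner step, which is the "stochastic gradients" cost the abstract's bound counts. The final `grad_norm` check only measures first-order stationarity; verifying the second-order condition would additionally require examining the smallest Hessian eigenvalue.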

Cite this Paper

BibTeX


@InProceedings{pmlr-v99-ge19a,
  title = 	 {Stabilized SVRG: Simple Variance Reduction for Nonconvex Optimization},
  author =       {Ge, Rong and Li, Zhize and Wang, Weiyao and Wang, Xiang},
  booktitle = 	 {Proceedings of the Thirty-Second Conference on Learning Theory},
  pages = 	 {1394--1448},
  year = 	 {2019},
  editor = 	 {Beygelzimer, Alina and Hsu, Daniel},
  volume = 	 {99},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {25--28 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v99/ge19a/ge19a.pdf},
  url = 	 {https://proceedings.mlr.press/v99/ge19a.html},
  abstract = 	 {Variance reduction techniques like SVRG provide simple and fast algorithms for optimizing a convex finite-sum objective. For nonconvex objectives, these techniques can also find a first-order stationary point (with small gradient). However, in nonconvex optimization it is often crucial to find a second-order stationary point (with small gradient and almost PSD hessian). In this paper, we show that Stabilized SVRG (a simple variant of SVRG) can find an $\epsilon$-second-order stationary point using only $\widetilde{O}(n^{2/3}/\epsilon^2+n/\epsilon^{1.5})$ stochastic gradients. To our best knowledge, this is the first second-order guarantee for a simple variant of SVRG. The running time almost matches the known guarantees for finding $\epsilon$-first-order stationary points.}
}

Endnote

%0 Conference Paper
%T Stabilized SVRG: Simple Variance Reduction for Nonconvex Optimization
%A Rong Ge
%A Zhize Li
%A Weiyao Wang
%A Xiang Wang
%B Proceedings of the Thirty-Second Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2019
%E Alina Beygelzimer
%E Daniel Hsu
%F pmlr-v99-ge19a
%I PMLR
%P 1394--1448
%U https://proceedings.mlr.press/v99/ge19a.html
%V 99
%X Variance reduction techniques like SVRG provide simple and fast algorithms for optimizing a convex finite-sum objective. For nonconvex objectives, these techniques can also find a first-order stationary point (with small gradient). However, in nonconvex optimization it is often crucial to find a second-order stationary point (with small gradient and almost PSD hessian). In this paper, we show that Stabilized SVRG (a simple variant of SVRG) can find an $\epsilon$-second-order stationary point using only $\widetilde{O}(n^{2/3}/\epsilon^2+n/\epsilon^{1.5})$ stochastic gradients. To our best knowledge, this is the first second-order guarantee for a simple variant of SVRG. The running time almost matches the known guarantees for finding $\epsilon$-first-order stationary points.

APA


Ge, R., Li, Z., Wang, W. & Wang, X. (2019). Stabilized SVRG: Simple Variance Reduction for Nonconvex Optimization. Proceedings of the Thirty-Second Conference on Learning Theory, in Proceedings of Machine Learning Research 99:1394-1448. Available from https://proceedings.mlr.press/v99/ge19a.html.
