Stochastic Variance Reduction for Nonconvex Optimization

Sashank J. Reddi; Ahmed Hefny; Suvrit Sra; Barnabas Poczos; Alex Smola

Stochastic Variance Reduction for Nonconvex Optimization

Sashank J. Reddi, Ahmed Hefny, Suvrit Sra, Barnabas Poczos, Alex Smola

Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:314-323, 2016.

Abstract

We study nonconvex finite-sum problems and analyze stochastic variance reduced gradient (SVRG) methods for them. SVRG and related methods have recently surged into prominence for convex optimization given their edge over stochastic gradient descent (SGD); but their theoretical analysis almost exclusively assumes convexity. In contrast, we prove non-asymptotic rates of convergence (to stationary points) of SVRG for nonconvex optimization, and show that it is provably faster than SGD and gradient descent. We also analyze a subclass of nonconvex problems on which SVRG attains linear convergence to the global optimum. We extend our analysis to mini-batch variants of SVRG, showing (theoretical) linear speedup due to minibatching in parallel settings.

Cite this Paper

BibTeX

@InProceedings{pmlr-v48-reddi16,
  title = 	 {Stochastic Variance Reduction for Nonconvex Optimization},
  author = 	 {Reddi, Sashank J. and Hefny, Ahmed and Sra, Suvrit and Poczos, Barnabas and Smola, Alex},
  booktitle = 	 {Proceedings of The 33rd International Conference on Machine Learning},
  pages = 	 {314--323},
  year = 	 {2016},
  editor = 	 {Balcan, Maria Florina and Weinberger, Kilian Q.},
  volume = 	 {48},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {New York, New York, USA},
  month = 	 {20--22 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v48/reddi16.pdf},
  url = 	 {https://proceedings.mlr.press/v48/reddi16.html},
  abstract = 	 {We study nonconvex finite-sum problems and analyze stochastic variance reduced gradient (SVRG) methods for them. SVRG and related methods have recently surged into prominence for convex optimization given their edge over stochastic gradient descent (SGD); but their theoretical analysis almost exclusively assumes convexity. In contrast, we prove non-asymptotic rates of convergence (to stationary points) of SVRG for nonconvex optimization, and show that it is provably faster than SGD and gradient descent. We also analyze a subclass of nonconvex problems on which SVRG attains linear convergence to the global optimum. We extend our analysis to mini-batch variants of SVRG, showing (theoretical) linear speedup due to minibatching in parallel settings.}
}

Endnote

%0 Conference Paper
%T Stochastic Variance Reduction for Nonconvex Optimization
%A Sashank J. Reddi
%A Ahmed Hefny
%A Suvrit Sra
%A Barnabas Poczos
%A Alex Smola
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger	
%F pmlr-v48-reddi16
%I PMLR
%P 314--323
%U https://proceedings.mlr.press/v48/reddi16.html
%V 48
%X We study nonconvex finite-sum problems and analyze stochastic variance reduced gradient (SVRG) methods for them. SVRG and related methods have recently surged into prominence for convex optimization given their edge over stochastic gradient descent (SGD); but their theoretical analysis almost exclusively assumes convexity. In contrast, we prove non-asymptotic rates of convergence (to stationary points) of SVRG for nonconvex optimization, and show that it is provably faster than SGD and gradient descent. We also analyze a subclass of nonconvex problems on which SVRG attains linear convergence to the global optimum. We extend our analysis to mini-batch variants of SVRG, showing (theoretical) linear speedup due to minibatching in parallel settings.

RIS

TY  - CPAPER
TI  - Stochastic Variance Reduction for Nonconvex Optimization
AU  - Sashank J. Reddi
AU  - Ahmed Hefny
AU  - Suvrit Sra
AU  - Barnabas Poczos
AU  - Alex Smola
BT  - Proceedings of The 33rd International Conference on Machine Learning
DA  - 2016/06/11
ED  - Maria Florina Balcan
ED  - Kilian Q. Weinberger	
ID  - pmlr-v48-reddi16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 48
SP  - 314
EP  - 323
L1  - http://proceedings.mlr.press/v48/reddi16.pdf
UR  - https://proceedings.mlr.press/v48/reddi16.html
AB  - We study nonconvex finite-sum problems and analyze stochastic variance reduced gradient (SVRG) methods for them. SVRG and related methods have recently surged into prominence for convex optimization given their edge over stochastic gradient descent (SGD); but their theoretical analysis almost exclusively assumes convexity. In contrast, we prove non-asymptotic rates of convergence (to stationary points) of SVRG for nonconvex optimization, and show that it is provably faster than SGD and gradient descent. We also analyze a subclass of nonconvex problems on which SVRG attains linear convergence to the global optimum. We extend our analysis to mini-batch variants of SVRG, showing (theoretical) linear speedup due to minibatching in parallel settings.
ER  -

APA

Reddi, S.J., Hefny, A., Sra, S., Poczos, B. & Smola, A.. (2016). Stochastic Variance Reduction for Nonconvex Optimization. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:314-323 Available from https://proceedings.mlr.press/v48/reddi16.html.

Stochastic Variance Reduction for Nonconvex Optimization

Abstract

Cite this Paper

Related Material