Stochastic Nonconvex Optimization with Large Minibatches

Weiran Wang, Nathan Srebro
Proceedings of the 30th International Conference on Algorithmic Learning Theory, PMLR 98:857-882, 2019.

Abstract

We study stochastic optimization of nonconvex loss functions, which are typical objectives for training neural networks. We propose stochastic approximation algorithms which optimize a series of regularized, nonlinearized losses on large minibatches of samples, using only first-order gradient information. Our algorithms provably converge to an approximate critical point of the expected objective with faster rates than minibatch stochastic gradient descent, and facilitate better parallelization by allowing larger minibatches.
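As a rough illustration of the approach the abstract describes, the sketch below repeatedly draws a large minibatch and approximately minimizes the (non-linearized) minibatch loss plus a quadratic proximal term around the current iterate, using only first-order gradient steps for the inner solve. The function names, the toy sigmoid-regression objective, the plain gradient-descent inner solver, and all step sizes here are illustrative assumptions, not the paper's exact algorithms or constants; the precise methods and rates are in the full text.

# Illustrative sketch (assumptions noted above), not the paper's exact algorithm.
import numpy as np

def minibatch_prox_step(w, batch, grad_fn, prox_weight=1.0,
                        inner_steps=50, inner_lr=0.1):
    """Approximately minimize  F_batch(v) + (prox_weight/2) * ||v - w||^2
    with plain gradient descent; any first-order inner solver could be used."""
    v = w.copy()
    for _ in range(inner_steps):
        # gradient of the regularized (non-linearized) minibatch loss
        g = grad_fn(v, batch) + prox_weight * (v - w)
        v -= inner_lr * g
    return v

def run(w0, sample_batch, grad_fn, rounds=100, batch_size=4096):
    """Outer loop: solve a series of regularized minibatch subproblems."""
    w = w0.copy()
    for _ in range(rounds):
        batch = sample_batch(batch_size)   # large minibatch of fresh samples
        w = minibatch_prox_step(w, batch, grad_fn)
    return w

# Toy usage: fit y = sigmoid(x @ w_true) under squared loss, a nonconvex objective in w.
rng = np.random.default_rng(0)
w_true = rng.normal(size=10)

def sample_batch(n):
    x = rng.normal(size=(n, 10))
    y = 1.0 / (1.0 + np.exp(-x @ w_true))
    return x, y

def grad_fn(w, batch):
    x, y = batch
    p = 1.0 / (1.0 + np.exp(-x @ w))
    r = p - y
    # gradient of 0.5 * mean((p - y)^2) with respect to w
    return x.T @ (r * p * (1.0 - p)) / len(y)

w_hat = run(np.zeros(10), sample_batch, grad_fn)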

Cite this Paper


BibTeX
@InProceedings{pmlr-v98-wang19a,
  title     = {Stochastic Nonconvex Optimization with Large Minibatches},
  author    = {Wang, Weiran and Srebro, Nathan},
  booktitle = {Proceedings of the 30th International Conference on Algorithmic Learning Theory},
  pages     = {857--882},
  year      = {2019},
  editor    = {Garivier, Aurélien and Kale, Satyen},
  volume    = {98},
  series    = {Proceedings of Machine Learning Research},
  month     = {22--24 Mar},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v98/wang19a/wang19a.pdf},
  url       = {https://proceedings.mlr.press/v98/wang19a.html},
  abstract  = {We study stochastic optimization of nonconvex loss functions, which are typical objectives for training neural networks. We propose stochastic approximation algorithms which optimize a series of regularized, nonlinearized losses on large minibatches of samples, using only first-order gradient information. Our algorithms provably converge to an approximate critical point of the expected objective with faster rates than minibatch stochastic gradient descent, and facilitate better parallelization by allowing larger minibatches.}
}
Endnote
%0 Conference Paper
%T Stochastic Nonconvex Optimization with Large Minibatches
%A Weiran Wang
%A Nathan Srebro
%B Proceedings of the 30th International Conference on Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2019
%E Aurélien Garivier
%E Satyen Kale
%F pmlr-v98-wang19a
%I PMLR
%P 857--882
%U https://proceedings.mlr.press/v98/wang19a.html
%V 98
%X We study stochastic optimization of nonconvex loss functions, which are typical objectives for training neural networks. We propose stochastic approximation algorithms which optimize a series of regularized, nonlinearized losses on large minibatches of samples, using only first-order gradient information. Our algorithms provably converge to an approximate critical point of the expected objective with faster rates than minibatch stochastic gradient descent, and facilitate better parallelization by allowing larger minibatches.
APA
Wang, W. & Srebro, N. (2019). Stochastic Nonconvex Optimization with Large Minibatches. Proceedings of the 30th International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 98:857-882. Available from https://proceedings.mlr.press/v98/wang19a.html.
