Nonconvex Variance Reduced Optimization with Arbitrary Sampling

Samuel Horváth, Peter Richtárik
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:2781-2789, 2019.

Abstract

We provide the first importance sampling variants of variance reduced algorithms for empirical risk minimization with non-convex loss functions. In particular, we analyze non-convex versions of SVRG, SAGA and SARAH. Our methods have the capacity to speed up the training process by an order of magnitude compared to the state of the art on real datasets. Moreover, we improve upon the current mini-batch analysis of these methods by proposing importance sampling for mini-batches in this setting. Surprisingly, in some regimes our approach can lead to superlinear speedup with respect to the mini-batch size, which is not usually present in stochastic optimization. All of the above results follow from a general analysis of the methods that works with arbitrary sampling, i.e., a fully general randomized strategy for the selection of the subset of examples to be sampled in each iteration. Finally, we also perform a novel importance sampling analysis of SARAH in the convex setting.
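To make the sampling idea concrete: instead of picking example i uniformly, importance sampling draws i with probability p_i proportional to some per-example quantity, commonly the smoothness constant L_i of f_i, while a 1/(n p_i) correction keeps the gradient estimator unbiased. Below is a minimal Python sketch of this device inside an SVRG-style inner loop on a toy least-squares problem. It is an illustration under stated assumptions, not the paper's exact algorithm: the rule p_i ∝ L_i, the step size, and the inner-loop length m = n are illustrative choices, and the toy problem is convex, whereas the paper's focus is the non-convex case.

    # Sketch: SVRG inner loop with importance sampling (illustrative, not
    # the paper's exact method). Each component is f_i(x) = 0.5*(a_i^T x - b_i)^2,
    # whose smoothness constant is L_i = ||a_i||^2.
    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 200, 10
    A = rng.standard_normal((n, d)) * rng.uniform(0.1, 5.0, size=(n, 1))  # uneven row norms
    b = rng.standard_normal(n)

    def grad_i(x, i):
        # Gradient of the i-th component f_i(x) = 0.5 * (a_i^T x - b_i)^2.
        return (A[i] @ x - b[i]) * A[i]

    def full_grad(x):
        # Gradient of f(x) = (1/n) * sum_i f_i(x).
        return A.T @ (A @ x - b) / n

    L = np.sum(A ** 2, axis=1)     # per-example smoothness constants L_i
    p = L / L.sum()                # importance sampling probabilities p_i ∝ L_i
    step = 1.0 / (4 * L.mean())    # illustrative step size, not the paper's choice

    x = np.zeros(d)
    for epoch in range(30):
        w = x.copy()               # snapshot point
        mu = full_grad(w)          # full gradient at the snapshot
        for _ in range(n):         # inner loop of length m = n (illustrative)
            i = rng.choice(n, p=p)
            # Unbiased SVRG estimator under sampling probabilities p_i:
            # the 1/(n * p_i) factor ensures E[g] = full_grad(x).
            g = (grad_i(x, i) - grad_i(w, i)) / (n * p[i]) + mu
            x -= step * g
    print("final gradient norm:", np.linalg.norm(full_grad(x)))

The point of the correction factor is easy to verify: E[(∇f_i(x) − ∇f_i(w)) / (n p_i)] = Σ_i p_i (∇f_i(x) − ∇f_i(w)) / (n p_i) = ∇f(x) − ∇f(w), so adding μ = ∇f(w) recovers ∇f(x) in expectation for any probabilities p_i > 0; the choice of p_i only affects the estimator's variance, which is what the arbitrary-sampling analysis in the paper exploits.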

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-horvath19a,
  title     = {Nonconvex Variance Reduced Optimization with Arbitrary Sampling},
  author    = {Horv{\'a}th, Samuel and Richt{\'a}rik, Peter},
  booktitle = {Proceedings of the 36th International Conference on Machine Learning},
  pages     = {2781--2789},
  year      = {2019},
  editor    = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume    = {97},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--15 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v97/horvath19a/horvath19a.pdf},
  url       = {https://proceedings.mlr.press/v97/horvath19a.html},
  abstract  = {We provide the first importance sampling variants of variance reduced algorithms for empirical risk minimization with non-convex loss functions. In particular, we analyze non-convex versions of \texttt{SVRG}, \texttt{SAGA} and \texttt{SARAH}. Our methods have the capacity to speed up the training process by an order of magnitude compared to the state of the art on real datasets. Moreover, we also improve upon current mini-batch analysis of these methods by proposing importance sampling for minibatches in this setting. Surprisingly, our approach can in some regimes lead to superlinear speedup with respect to the minibatch size, which is not usually present in stochastic optimization. All the above results follow from a general analysis of the methods which works with arbitrary sampling, i.e., fully general randomized strategy for the selection of subsets of examples to be sampled in each iteration. Finally, we also perform a novel importance sampling analysis of \texttt{SARAH} in the convex setting.}
}
Endnote
%0 Conference Paper
%T Nonconvex Variance Reduced Optimization with Arbitrary Sampling
%A Samuel Horváth
%A Peter Richtárik
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-horvath19a
%I PMLR
%P 2781--2789
%U https://proceedings.mlr.press/v97/horvath19a.html
%V 97
%X We provide the first importance sampling variants of variance reduced algorithms for empirical risk minimization with non-convex loss functions. In particular, we analyze non-convex versions of SVRG, SAGA and SARAH. Our methods have the capacity to speed up the training process by an order of magnitude compared to the state of the art on real datasets. Moreover, we also improve upon current mini-batch analysis of these methods by proposing importance sampling for minibatches in this setting. Surprisingly, our approach can in some regimes lead to superlinear speedup with respect to the minibatch size, which is not usually present in stochastic optimization. All the above results follow from a general analysis of the methods which works with arbitrary sampling, i.e., fully general randomized strategy for the selection of subsets of examples to be sampled in each iteration. Finally, we also perform a novel importance sampling analysis of SARAH in the convex setting.
APA
Horváth, S. & Richtárik, P. (2019). Nonconvex Variance Reduced Optimization with Arbitrary Sampling. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:2781-2789. Available from https://proceedings.mlr.press/v97/horvath19a.html.
