Stochastic Recursive Variance-Reduced Cubic Regularization Methods

Dongruo Zhou, Quanquan Gu
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:3980-3990, 2020.

Abstract

Stochastic Variance-Reduced Cubic regularization (SVRC) algorithms have received increasing attention due to its improved gradient/Hessian complexities (i.e., number of queries to stochastic gradient/Hessian oracles) to find local minima for nonconvex finite-sum optimization. However, it is unclear whether existing SVRC algorithms can be further improved. Moreover, the semi-stochastic Hessian estimator adopted in existing SVRC algorithms prevents the use of Hessian-vector product-based fast cubic subproblem solvers, which makes SVRC algorithms computationally intractable for high-dimensional problems. In this paper, we first present a Stochastic Recursive Variance-Reduced Cubic regularization method (SRVRC) using a recursively updated semi-stochastic gradient and Hessian estimators. It enjoys improved gradient and Hessian complexities to find an $(\epsilon, \sqrt{\epsilon})$-approximate local minimum, and outperforms the state-of-the-art SVRC algorithms. Built upon SRVRC, we further propose a Hessian-free SRVRC algorithm, namely SRVRC$_{\text{free}}$, which only needs $\tilde O(n\epsilon^{-2} \land \epsilon^{-3})$ stochastic gradient and Hessian-vector product computations, where $n$ is the number of component functions in the finite-sum objective and $\epsilon$ is the optimization precision. This outperforms the best-known result $\tilde O(\epsilon^{-3.5})$ achieved by stochastic cubic regularization algorithm proposed in \cite{tripuraneni2018stochastic}.

Cite this Paper


BibTeX
@InProceedings{pmlr-v108-zhou20a, title = {Stochastic Recursive Variance-Reduced Cubic Regularization Methods}, author = {Zhou, Dongruo and Gu, Quanquan}, booktitle = {Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics}, pages = {3980--3990}, year = {2020}, editor = {Silvia Chiappa and Roberto Calandra}, volume = {108}, series = {Proceedings of Machine Learning Research}, month = {26--28 Aug}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v108/zhou20a/zhou20a.pdf}, url = { http://proceedings.mlr.press/v108/zhou20a.html }, abstract = {Stochastic Variance-Reduced Cubic regularization (SVRC) algorithms have received increasing attention due to its improved gradient/Hessian complexities (i.e., number of queries to stochastic gradient/Hessian oracles) to find local minima for nonconvex finite-sum optimization. However, it is unclear whether existing SVRC algorithms can be further improved. Moreover, the semi-stochastic Hessian estimator adopted in existing SVRC algorithms prevents the use of Hessian-vector product-based fast cubic subproblem solvers, which makes SVRC algorithms computationally intractable for high-dimensional problems. In this paper, we first present a Stochastic Recursive Variance-Reduced Cubic regularization method (SRVRC) using a recursively updated semi-stochastic gradient and Hessian estimators. It enjoys improved gradient and Hessian complexities to find an $(\epsilon, \sqrt{\epsilon})$-approximate local minimum, and outperforms the state-of-the-art SVRC algorithms. Built upon SRVRC, we further propose a Hessian-free SRVRC algorithm, namely SRVRC$_{\text{free}}$, which only needs $\tilde O(n\epsilon^{-2} \land \epsilon^{-3})$ stochastic gradient and Hessian-vector product computations, where $n$ is the number of component functions in the finite-sum objective and $\epsilon$ is the optimization precision. This outperforms the best-known result $\tilde O(\epsilon^{-3.5})$ achieved by stochastic cubic regularization algorithm proposed in \cite{tripuraneni2018stochastic}. } }
Endnote
%0 Conference Paper %T Stochastic Recursive Variance-Reduced Cubic Regularization Methods %A Dongruo Zhou %A Quanquan Gu %B Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2020 %E Silvia Chiappa %E Roberto Calandra %F pmlr-v108-zhou20a %I PMLR %P 3980--3990 %U http://proceedings.mlr.press/v108/zhou20a.html %V 108 %X Stochastic Variance-Reduced Cubic regularization (SVRC) algorithms have received increasing attention due to its improved gradient/Hessian complexities (i.e., number of queries to stochastic gradient/Hessian oracles) to find local minima for nonconvex finite-sum optimization. However, it is unclear whether existing SVRC algorithms can be further improved. Moreover, the semi-stochastic Hessian estimator adopted in existing SVRC algorithms prevents the use of Hessian-vector product-based fast cubic subproblem solvers, which makes SVRC algorithms computationally intractable for high-dimensional problems. In this paper, we first present a Stochastic Recursive Variance-Reduced Cubic regularization method (SRVRC) using a recursively updated semi-stochastic gradient and Hessian estimators. It enjoys improved gradient and Hessian complexities to find an $(\epsilon, \sqrt{\epsilon})$-approximate local minimum, and outperforms the state-of-the-art SVRC algorithms. Built upon SRVRC, we further propose a Hessian-free SRVRC algorithm, namely SRVRC$_{\text{free}}$, which only needs $\tilde O(n\epsilon^{-2} \land \epsilon^{-3})$ stochastic gradient and Hessian-vector product computations, where $n$ is the number of component functions in the finite-sum objective and $\epsilon$ is the optimization precision. This outperforms the best-known result $\tilde O(\epsilon^{-3.5})$ achieved by stochastic cubic regularization algorithm proposed in \cite{tripuraneni2018stochastic}.
APA
Zhou, D. & Gu, Q.. (2020). Stochastic Recursive Variance-Reduced Cubic Regularization Methods. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 108:3980-3990 Available from http://proceedings.mlr.press/v108/zhou20a.html .

Related Material