Stochastic Approximation of Smooth and Strongly Convex Functions: Beyond the $O(1/T)$ Convergence Rate

Lijun Zhang, Zhi-Hua Zhou
Proceedings of the Thirty-Second Conference on Learning Theory, PMLR 99:3160-3179, 2019.

Abstract

Stochastic approximation (SA) is a classical approach for stochastic convex optimization. Previous studies have demonstrated that the convergence rate of SA can be improved by introducing either a smoothness or a strong convexity condition. In this paper, we make use of smoothness and strong convexity simultaneously to boost the convergence rate. Let $\lambda$ be the modulus of strong convexity, $\kappa$ be the condition number, $F_*$ be the minimal risk, and $\alpha>1$ be some small constant. First, we demonstrate that, in expectation, an $O(1/[\lambda T^\alpha] + \kappa F_*/T)$ risk bound is attainable when $T = \Omega(\kappa^\alpha)$. Thus, when $F_*$ is small, the convergence rate can be faster than $O(1/[\lambda T])$ and approaches $O(1/[\lambda T^\alpha])$ in the ideal case. Second, to further benefit from small risk, we show that, in expectation, an $O(1/2^{T/\kappa}+F_*)$ risk bound is achievable. Thus, the excess risk decreases exponentially until it reaches $O(F_*)$, and if $F_*=0$, we obtain global linear convergence. Finally, we emphasize that our proof is constructive and that each risk bound is equipped with an efficient stochastic algorithm attaining that bound.
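
To get a feel for how the two bounds in the abstract compare with the classical $O(1/[\lambda T])$ rate, the following is a minimal sketch that evaluates the expressions inside the big-O with all hidden constants set to 1. The function names and the numeric values of `lam`, `kappa`, `alpha`, and `F_star` are illustrative assumptions, not taken from the paper, and the output says nothing about the actual constants or the algorithms that attain the bounds.

```python
# Illustrative only: evaluates the expressions inside the O(.) bounds with
# all hidden constants assumed to be 1. Names and values are hypothetical.

def bound_classical(T, lam):
    """Classical strongly convex rate, constants dropped: 1 / (lam * T)."""
    return 1.0 / (lam * T)


def bound_polynomial(T, lam, kappa, F_star, alpha):
    """First bound, constants dropped: 1/(lam * T^alpha) + kappa * F_star / T.

    The abstract states it for T = Omega(kappa^alpha); that condition is the
    caller's responsibility here.
    """
    return 1.0 / (lam * T ** alpha) + kappa * F_star / T


def bound_linear(T, kappa, F_star):
    """Second bound, constants dropped: 2^(-T/kappa) + F_star."""
    return 2.0 ** (-T / kappa) + F_star


if __name__ == "__main__":
    lam, kappa, alpha = 0.1, 10.0, 2.0  # assumed values for illustration
    T = 1000                            # satisfies T >= kappa**alpha = 100
    for F_star in (0.0, 1e-4, 1e-2):
        print(
            f"F_*={F_star:g}: "
            f"classical={bound_classical(T, lam):.3e}, "
            f"polynomial={bound_polynomial(T, lam, kappa, F_star, alpha):.3e}, "
            f"linear={bound_linear(T, kappa, F_star):.3e}"
        )
```

Under these assumed values, the first expression drops below the classical one when `F_star` is zero, and the second is dominated by the `F_star` term once the exponential part has decayed, which mirrors the qualitative claims in the abstract.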

Cite this Paper


BibTeX
@InProceedings{pmlr-v99-zhang19a,
  title = {Stochastic Approximation of Smooth and Strongly Convex Functions: Beyond the $O(1/T)$ Convergence Rate},
  author = {Zhang, Lijun and Zhou, Zhi-Hua},
  booktitle = {Proceedings of the Thirty-Second Conference on Learning Theory},
  pages = {3160--3179},
  year = {2019},
  editor = {Beygelzimer, Alina and Hsu, Daniel},
  volume = {99},
  series = {Proceedings of Machine Learning Research},
  month = {25--28 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v99/zhang19a/zhang19a.pdf},
  url = {https://proceedings.mlr.press/v99/zhang19a.html},
  abstract = {Stochastic approximation (SA) is a classical approach for stochastic convex optimization. Previous studies have demonstrated that the convergence rate of SA can be improved by introducing either smoothness or strong convexity condition. In this paper, we make use of smoothness and strong convexity simultaneously to boost the convergence rate. Let $\lambda$ be the modulus of strong convexity, $\kappa$ be the condition number, $F_*$ be the minimal risk, and $\alpha>1$ be some small constant. First, we demonstrate that, in expectation, an $O(1/[\lambda T^\alpha] + \kappa F_*/T)$ risk bound is attainable when $T = \Omega(\kappa^\alpha)$. Thus, when $F_*$ is small, the convergence rate could be faster than $O(1/[\lambda T])$ and approaches $O(1/[\lambda T^\alpha])$ in the ideal case. Second, to further benefit from small risk, we show that, in expectation, an $O(1/2^{T/\kappa}+F_*)$ risk bound is achievable. Thus, the excess risk reduces exponentially until reaching $O(F_*)$, and if $F_*=0$, we obtain a global linear convergence. Finally, we emphasize that our proof is constructive and each risk bound is equipped with an efficient stochastic algorithm attaining that bound.}
}
Endnote
%0 Conference Paper
%T Stochastic Approximation of Smooth and Strongly Convex Functions: Beyond the $O(1/T)$ Convergence Rate
%A Lijun Zhang
%A Zhi-Hua Zhou
%B Proceedings of the Thirty-Second Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2019
%E Alina Beygelzimer
%E Daniel Hsu
%F pmlr-v99-zhang19a
%I PMLR
%P 3160--3179
%U https://proceedings.mlr.press/v99/zhang19a.html
%V 99
%X Stochastic approximation (SA) is a classical approach for stochastic convex optimization. Previous studies have demonstrated that the convergence rate of SA can be improved by introducing either smoothness or strong convexity condition. In this paper, we make use of smoothness and strong convexity simultaneously to boost the convergence rate. Let $\lambda$ be the modulus of strong convexity, $\kappa$ be the condition number, $F_*$ be the minimal risk, and $\alpha>1$ be some small constant. First, we demonstrate that, in expectation, an $O(1/[\lambda T^\alpha] + \kappa F_*/T)$ risk bound is attainable when $T = \Omega(\kappa^\alpha)$. Thus, when $F_*$ is small, the convergence rate could be faster than $O(1/[\lambda T])$ and approaches $O(1/[\lambda T^\alpha])$ in the ideal case. Second, to further benefit from small risk, we show that, in expectation, an $O(1/2^{T/\kappa}+F_*)$ risk bound is achievable. Thus, the excess risk reduces exponentially until reaching $O(F_*)$, and if $F_*=0$, we obtain a global linear convergence. Finally, we emphasize that our proof is constructive and each risk bound is equipped with an efficient stochastic algorithm attaining that bound.
APA
Zhang, L. & Zhou, Z.-H. (2019). Stochastic Approximation of Smooth and Strongly Convex Functions: Beyond the $O(1/T)$ Convergence Rate. Proceedings of the Thirty-Second Conference on Learning Theory, in Proceedings of Machine Learning Research 99:3160-3179. Available from https://proceedings.mlr.press/v99/zhang19a.html.
