Accelerated SGD for Non-Strongly-Convex Least Squares

Aditya Varre; Nicolas Flammarion

Proceedings of Thirty Fifth Conference on Learning Theory, PMLR 178:2062-2126, 2022.

Abstract

We consider stochastic approximation for the least squares regression problem in the non-strongly convex setting. We present the first practical algorithm that achieves the optimal prediction error rates in terms of dependence on the noise of the problem, as $O(d/t)$ while accelerating the forgetting of the initial conditions to $O(d/t^2)$. Our new algorithm is based on a simple modification of the accelerated gradient descent. We provide convergence results for both the averaged and the last iterate of the algorithm. In order to describe the tightness of these new bounds, we present a matching lower bound in the noiseless setting and thus show the optimality of our algorithm.

Cite this Paper

BibTeX


@InProceedings{pmlr-v178-varre22a,
  title = 	 {Accelerated SGD for Non-Strongly-Convex Least Squares},
  author =       {Varre, Aditya and Flammarion, Nicolas},
  booktitle = 	 {Proceedings of Thirty Fifth Conference on Learning Theory},
  pages = 	 {2062--2126},
  year = 	 {2022},
  editor = 	 {Loh, Po-Ling and Raginsky, Maxim},
  volume = 	 {178},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {02--05 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v178/varre22a/varre22a.pdf},
  url = 	 {https://proceedings.mlr.press/v178/varre22a.html},
  abstract = 	 {We consider stochastic approximation for the least squares regression problem in the non-strongly convex setting. We present the first practical algorithm that achieves the optimal prediction error rates in terms of dependence on the noise of the problem, as $O(d/t)$ while accelerating the forgetting of the initial conditions to $O(d/t^2)$. Our new algorithm is based on a simple modification of the accelerated gradient descent. We provide convergence results for both the averaged and the last iterate of the algorithm. In order to describe the tightness of these new bounds, we present a matching lower bound in the noiseless setting and thus show the optimality of our algorithm.}
}

Endnote

%0 Conference Paper
%T Accelerated SGD for Non-Strongly-Convex Least Squares
%A Aditya Varre
%A Nicolas Flammarion
%B Proceedings of Thirty Fifth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2022
%E Po-Ling Loh
%E Maxim Raginsky
%F pmlr-v178-varre22a
%I PMLR
%P 2062--2126
%U https://proceedings.mlr.press/v178/varre22a.html
%V 178
%X We consider stochastic approximation for the least squares regression problem in the non-strongly convex setting. We present the first practical algorithm that achieves the optimal prediction error rates in terms of dependence on the noise of the problem, as $O(d/t)$ while accelerating the forgetting of the initial conditions to $O(d/t^2)$. Our new algorithm is based on a simple modification of the accelerated gradient descent. We provide convergence results for both the averaged and the last iterate of the algorithm. In order to describe the tightness of these new bounds, we present a matching lower bound in the noiseless setting and thus show the optimality of our algorithm.

APA


Varre, A. & Flammarion, N. (2022). Accelerated SGD for Non-Strongly-Convex Least Squares. Proceedings of Thirty Fifth Conference on Learning Theory, in Proceedings of Machine Learning Research 178:2062-2126. Available from https://proceedings.mlr.press/v178/varre22a.html.
