On Convergence-Diagnostic based Step Sizes for Stochastic Gradient Descent

Scott Pesme, Aymeric Dieuleveut, Nicolas Flammarion
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:7641-7651, 2020.

Abstract

Constant step-size Stochastic Gradient Descent exhibits two phases: a transient phase during which iterates make fast progress towards the optimum, followed by a stationary phase during which iterates oscillate around the optimal point. In this paper, we show that efficiently detecting this transition and appropriately decreasing the step size can lead to fast convergence rates. We analyse the classical statistical test proposed by Pflug (1983), based on the inner product between consecutive stochastic gradients. Even in the simple case where the objective function is quadratic, we show that this test cannot lead to an adequate convergence diagnostic. We then propose a novel and simple statistical procedure that accurately detects stationarity and we provide experimental results showing state-of-the-art performance on synthetic and real-world datasets.
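
For concreteness, below is a minimal sketch of the classical Pflug (1983) diagnostic that the abstract refers to: run constant-step-size SGD, accumulate the inner products between consecutive stochastic gradients, and halve the step size once the running sum turns negative (a sign that the iterates have entered the stationary phase). The least-squares problem, the function names (`stoch_grad`, `sgd_pflug`), the burn-in length, and the halving rule are illustrative assumptions, not taken from the paper; indeed, the paper's point is that this inner-product test can fail even on quadratic objectives.

```python
import numpy as np

# Illustrative problem: least-squares regression (assumed setup, not from the paper).
rng = np.random.default_rng(0)
d, n = 10, 10_000
A = rng.normal(size=(n, d))
x_star = rng.normal(size=d)
b = A @ x_star + 0.1 * rng.normal(size=n)

def stoch_grad(x):
    """Stochastic gradient of the least-squares loss on one random sample."""
    i = rng.integers(n)
    return (A[i] @ x - b[i]) * A[i]

def sgd_pflug(x0, gamma=0.1, burn_in=500, n_iter=20_000):
    """SGD with a Pflug-type diagnostic: halve the step size when the
    running sum of <g_k, g_{k+1}> turns negative after a burn-in."""
    x = x0.copy()
    g_prev = stoch_grad(x)
    running_sum, count = 0.0, 0
    for _ in range(n_iter):
        x = x - gamma * g_prev
        g = stoch_grad(x)
        running_sum += g_prev @ g   # inner product of consecutive gradients
        count += 1
        if count > burn_in and running_sum < 0:
            gamma /= 2              # stationarity detected: decrease step size
            running_sum, count = 0.0, 0
        g_prev = g
    return x, gamma

x_hat, gamma_final = sgd_pflug(np.zeros(d))
print(f"final step size: {gamma_final:.2e}, "
      f"error: {np.linalg.norm(x_hat - x_star):.3f}")
```

The intuition behind the test: far from the optimum, consecutive stochastic gradients point in similar directions (positive inner product); near the optimum, noise dominates and the expected inner product becomes negative.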

Cite this Paper

BibTeX
@InProceedings{pmlr-v119-pesme20a,
  title     = {On Convergence-Diagnostic based Step Sizes for Stochastic Gradient Descent},
  author    = {Pesme, Scott and Dieuleveut, Aymeric and Flammarion, Nicolas},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {7641--7651},
  year      = {2020},
  editor    = {III, Hal Daumé and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/pesme20a/pesme20a.pdf},
  url       = {https://proceedings.mlr.press/v119/pesme20a.html},
  abstract  = {Constant step-size Stochastic Gradient Descent exhibits two phases: a transient phase during which iterates make fast progress towards the optimum, followed by a stationary phase during which iterates oscillate around the optimal point. In this paper, we show that efficiently detecting this transition and appropriately decreasing the step size can lead to fast convergence rates. We analyse the classical statistical test proposed by Pflug (1983), based on the inner product between consecutive stochastic gradients. Even in the simple case where the objective function is quadratic, we show that this test cannot lead to an adequate convergence diagnostic. We then propose a novel and simple statistical procedure that accurately detects stationarity and we provide experimental results showing state-of-the-art performance on synthetic and real-world datasets.}
}
Endnote
%0 Conference Paper
%T On Convergence-Diagnostic based Step Sizes for Stochastic Gradient Descent
%A Scott Pesme
%A Aymeric Dieuleveut
%A Nicolas Flammarion
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-pesme20a
%I PMLR
%P 7641--7651
%U https://proceedings.mlr.press/v119/pesme20a.html
%V 119
%X Constant step-size Stochastic Gradient Descent exhibits two phases: a transient phase during which iterates make fast progress towards the optimum, followed by a stationary phase during which iterates oscillate around the optimal point. In this paper, we show that efficiently detecting this transition and appropriately decreasing the step size can lead to fast convergence rates. We analyse the classical statistical test proposed by Pflug (1983), based on the inner product between consecutive stochastic gradients. Even in the simple case where the objective function is quadratic, we show that this test cannot lead to an adequate convergence diagnostic. We then propose a novel and simple statistical procedure that accurately detects stationarity and we provide experimental results showing state-of-the-art performance on synthetic and real-world datasets.
APA
Pesme, S., Dieuleveut, A. & Flammarion, N. (2020). On Convergence-Diagnostic based Step Sizes for Stochastic Gradient Descent. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:7641-7651. Available from https://proceedings.mlr.press/v119/pesme20a.html.
