Breaking the Heavy-Tailed Noise Barrier in Stochastic Optimization Problems

Nikita Puchkin, Eduard Gorbunov, Nickolay Kutuzov, Alexander Gasnikov
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:856-864, 2024.

Abstract

We consider stochastic optimization problems with heavy-tailed noise with structured density. For such problems, we show that it is possible to achieve convergence rates faster than $O(K^{-2(\alpha - 1) / \alpha})$ when the stochastic gradients have a finite $\alpha$-th moment, $\alpha \in (1, 2]$. In particular, our analysis allows the noise norm to have an unbounded expectation. To achieve these results, we stabilize stochastic gradients using smoothed medians of means. We prove that the resulting estimates have negligible bias and controllable variance. This allows us to carefully incorporate them into clipped-SGD and clipped-SSTM and derive new high-probability complexity bounds in the considered setup.
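To make the abstract's claim concrete: for $\alpha = 2$ (finite variance) the baseline rate $O(K^{-2(\alpha - 1)/\alpha})$ is the familiar $O(K^{-1})$, and it degrades toward $O(1)$ as $\alpha \to 1$. The sketch below illustrates the two ingredients the abstract names, a median-of-means gradient estimate fed into a clipped-SGD update, on a toy quadratic with infinite-variance noise. It is a minimal illustration, not the paper's algorithm: a plain coordinate-wise median stands in for the smoothed median of means, clipped-SSTM is omitted, and all names and hyperparameters (`mom_gradient`, `clip`, batch size, block count, clipping level, step size) are illustrative assumptions.

```python
import numpy as np

def mom_gradient(grad_samples, n_blocks):
    """Median-of-means estimate from a batch of stochastic gradients.

    Averages within each of `n_blocks` blocks, then takes the
    coordinate-wise median of the block means -- a simplified stand-in
    for the paper's smoothed median of means.
    """
    blocks = np.array_split(grad_samples, n_blocks)
    block_means = np.stack([b.mean(axis=0) for b in blocks])
    return np.median(block_means, axis=0)

def clip(v, lam):
    """Clipping operator used in clipped-SGD: rescale v to norm <= lam."""
    norm = np.linalg.norm(v)
    return v if norm <= lam else (lam / norm) * v

# Toy run: minimize f(x) = ||x||^2 / 2 under heavy-tailed gradient noise.
rng = np.random.default_rng(0)
d, n_steps = 10, 2000
batch, n_blocks = 30, 5      # illustrative choices, not the paper's tuning
step, lam = 0.1, 2.0         # step size and clipping level (assumptions)
tail = 1.5                   # Lomax tail index: infinite variance for tail < 2
x = np.ones(d)

for _ in range(n_steps):
    # Stochastic gradients: true gradient plus centered Lomax (Pareto II)
    # noise, whose moments of order >= `tail` are infinite.
    noise = rng.pareto(tail, size=(batch, d)) - 1.0 / (tail - 1.0)
    grads = x + noise
    g = mom_gradient(grads, n_blocks)   # stabilized gradient estimate
    x = x - step * clip(g, lam)         # clipped-SGD update

print("final ||x|| =", np.linalg.norm(x))  # typically far below the initial norm
```

In this setup the plain batch mean inherits the infinite variance of the noise, while the median over blocks keeps the update stable; that contrast is the intuition behind the negligible-bias, controllable-variance property the abstract describes.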

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-puchkin24a,
  title     = {Breaking the Heavy-Tailed Noise Barrier in Stochastic Optimization Problems},
  author    = {Puchkin, Nikita and Gorbunov, Eduard and Kutuzov, Nickolay and Gasnikov, Alexander},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages     = {856--864},
  year      = {2024},
  editor    = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume    = {238},
  series    = {Proceedings of Machine Learning Research},
  month     = {02--04 May},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v238/puchkin24a/puchkin24a.pdf},
  url       = {https://proceedings.mlr.press/v238/puchkin24a.html},
  abstract  = {We consider stochastic optimization problems with heavy-tailed noise with structured density. For such problems, we show that it is possible to get faster rates of convergence than $O(K^{-2(\alpha - 1) / \alpha})$, when the stochastic gradients have finite $\alpha$-th moment, $\alpha \in (1, 2]$. In particular, our analysis allows the noise norm to have an unbounded expectation. To achieve these results, we stabilize stochastic gradients, using smoothed medians of means. We prove that the resulting estimates have negligible bias and controllable variance. This allows us to carefully incorporate them into clipped-SGD and clipped-SSTM and derive new high-probability complexity bounds in the considered setup.}
}
Endnote
%0 Conference Paper
%T Breaking the Heavy-Tailed Noise Barrier in Stochastic Optimization Problems
%A Nikita Puchkin
%A Eduard Gorbunov
%A Nickolay Kutuzov
%A Alexander Gasnikov
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-puchkin24a
%I PMLR
%P 856--864
%U https://proceedings.mlr.press/v238/puchkin24a.html
%V 238
%X We consider stochastic optimization problems with heavy-tailed noise with structured density. For such problems, we show that it is possible to get faster rates of convergence than $O(K^{-2(\alpha - 1) / \alpha})$, when the stochastic gradients have finite $\alpha$-th moment, $\alpha \in (1, 2]$. In particular, our analysis allows the noise norm to have an unbounded expectation. To achieve these results, we stabilize stochastic gradients, using smoothed medians of means. We prove that the resulting estimates have negligible bias and controllable variance. This allows us to carefully incorporate them into clipped-SGD and clipped-SSTM and derive new high-probability complexity bounds in the considered setup.
APA
Puchkin, N., Gorbunov, E., Kutuzov, N. & Gasnikov, A. (2024). Breaking the Heavy-Tailed Noise Barrier in Stochastic Optimization Problems. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:856-864. Available from https://proceedings.mlr.press/v238/puchkin24a.html.