Online Convex Optimization with Heavy Tails: Old Algorithms, New Regrets, and Applications

Zijian Liu
Proceedings of The 37th International Conference on Algorithmic Learning Theory, PMLR 313:1-47, 2026.

Abstract

In Online Convex Optimization (OCO), when the stochastic gradient has a finite variance, many algorithms provably work and guarantee sublinear regret. However, few results are known when the gradient estimate has a heavy tail, i.e., the stochastic gradient only admits a finite $\mathsf{p}$-th central moment for some $\mathsf{p}\in\left(1,2\right]$. Motivated by this gap, this work examines several old algorithms for OCO (e.g., Online Gradient Descent) in the more challenging heavy-tailed setting. Under the standard bounded-domain assumption, we establish new regret bounds for these classical methods without any algorithmic modification. Remarkably, these bounds are fully optimal in all parameters (and can be achieved even without knowing $\mathsf{p}$), suggesting that OCO with heavy tails can be solved effectively without any extra operation (e.g., gradient clipping). Our new results have several applications; a particularly interesting one is the first provable and optimal convergence result for nonsmooth nonconvex optimization under heavy-tailed noise without gradient clipping.
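For concreteness, the kind of "old algorithm" the abstract highlights, projected Online Gradient Descent on a bounded domain, can be sketched as below. This is an illustrative sketch (function and parameter names are hypothetical, and the step size is the classical `D / sqrt(T)` choice), not code from the paper; the paper's contribution is analyzing this same unmodified method when `grad_fn` only has a finite $\mathsf{p}$-th central moment.

```python
import numpy as np

def ogd_ball(grad_fn, T, D=1.0, d=2, eta=None):
    """Projected Online Gradient Descent on the Euclidean ball of radius D.

    grad_fn(t, x) returns a (possibly heavy-tailed stochastic) gradient
    estimate at the current iterate x. No clipping is applied, matching
    the "no extra operation" setting described in the abstract.
    """
    if eta is None:
        eta = D / np.sqrt(T)  # classical step size for bounded domains
    x = np.zeros(d)
    iterates = [x.copy()]
    for t in range(T):
        g = grad_fn(t, x)
        x = x - eta * g
        # Euclidean projection back onto the bounded domain {||x|| <= D}
        norm = np.linalg.norm(x)
        if norm > D:
            x = x * (D / norm)
        iterates.append(x.copy())
    return iterates
```

On a fixed quadratic loss the iterates contract toward the minimizer, while the projection keeps every iterate inside the domain regardless of how large an individual gradient sample is.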

Cite this Paper


BibTeX
@InProceedings{pmlr-v313-liu26a,
  title     = {Online Convex Optimization with Heavy Tails: Old Algorithms, New Regrets, and Applications},
  author    = {Liu, Zijian},
  booktitle = {Proceedings of The 37th International Conference on Algorithmic Learning Theory},
  pages     = {1--47},
  year      = {2026},
  editor    = {Telgarsky, Matus and Ullman, Jonathan},
  volume    = {313},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--26 Feb},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v313/main/assets/liu26a/liu26a.pdf},
  url       = {https://proceedings.mlr.press/v313/liu26a.html},
  abstract  = {In Online Convex Optimization (OCO), when the stochastic gradient has a finite variance, many algorithms provably work and guarantee a sublinear regret. However, limited results are known if the gradient estimate has a heavy tail, i.e., the stochastic gradient only admits a finite $\mathsf{p}$-th central moment for some $\mathsf{p}\in\left(1,2\right]$. Motivated by it, this work examines different old algorithms for OCO (e.g., Online Gradient Descent) in the more challenging heavy-tailed setting. Under the standard bounded domain assumption, we establish new regrets for these classical methods without any algorithmic modification. Remarkably, these regret bounds are fully optimal in all parameters (can be achieved even without knowing $\mathsf{p}$), suggesting that OCO with heavy tails can be solved effectively without any extra operation (e.g., gradient clipping). Our new results have several applications. A particularly interesting one is the first provable and optimal convergence result for nonsmooth nonconvex optimization under heavy-tailed noise without gradient clipping.}
}
Endnote
%0 Conference Paper
%T Online Convex Optimization with Heavy Tails: Old Algorithms, New Regrets, and Applications
%A Zijian Liu
%B Proceedings of The 37th International Conference on Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2026
%E Matus Telgarsky
%E Jonathan Ullman
%F pmlr-v313-liu26a
%I PMLR
%P 1--47
%U https://proceedings.mlr.press/v313/liu26a.html
%V 313
%X In Online Convex Optimization (OCO), when the stochastic gradient has a finite variance, many algorithms provably work and guarantee a sublinear regret. However, limited results are known if the gradient estimate has a heavy tail, i.e., the stochastic gradient only admits a finite $\mathsf{p}$-th central moment for some $\mathsf{p}\in\left(1,2\right]$. Motivated by it, this work examines different old algorithms for OCO (e.g., Online Gradient Descent) in the more challenging heavy-tailed setting. Under the standard bounded domain assumption, we establish new regrets for these classical methods without any algorithmic modification. Remarkably, these regret bounds are fully optimal in all parameters (can be achieved even without knowing $\mathsf{p}$), suggesting that OCO with heavy tails can be solved effectively without any extra operation (e.g., gradient clipping). Our new results have several applications. A particularly interesting one is the first provable and optimal convergence result for nonsmooth nonconvex optimization under heavy-tailed noise without gradient clipping.
APA
Liu, Z. (2026). Online Convex Optimization with Heavy Tails: Old Algorithms, New Regrets, and Applications. Proceedings of The 37th International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 313:1-47. Available from https://proceedings.mlr.press/v313/liu26a.html.