Mirror Descent Strikes Again: Optimal Stochastic Convex Optimization under Infinite Noise Variance

Nuri Mert Vural, Lu Yu, Krishna Balasubramanian, Stanislav Volgushev, Murat A Erdogdu
Proceedings of Thirty Fifth Conference on Learning Theory, PMLR 178:65-102, 2022.

Abstract

We study stochastic convex optimization under infinite noise variance. Specifically, when the stochastic gradient is unbiased and has a uniformly bounded $(1+\kappa)$-th moment, for some $\kappa \in (0,1]$, we quantify the convergence rate of the Stochastic Mirror Descent algorithm with a particular class of uniformly convex mirror maps, in terms of the number of iterations, the dimensionality, and related geometric parameters of the optimization problem. Interestingly, this algorithm does not require any explicit gradient clipping or normalization, which have been used extensively in several recent empirical and theoretical works. We complement our convergence results with information-theoretic lower bounds showing that no other algorithm using only stochastic first-order oracles can achieve improved rates. Our results have several interesting consequences for devising online/streaming stochastic approximation algorithms for problems arising in robust statistics and machine learning.
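
For intuition, the algorithm studied here admits a very short implementation. The sketch below is a minimal, unconstrained Python version of stochastic mirror descent with a mirror map of the form $\Phi(x) = \frac{1}{p}\|x\|_p^p$ for some $p \in (1,2]$, run without any gradient clipping or normalization. The specific mirror map, step size, averaging scheme, and the omission of a Bregman projection onto a constraint set are illustrative assumptions, not the paper's exact construction.

import numpy as np

def smd_heavy_tailed(grad_oracle, x0, steps, eta, p=1.5):
    # Stochastic mirror descent with mirror map Phi(x) = (1/p) * ||x||_p^p
    # (illustrative choice, p in (1, 2]); unconstrained, no gradient clipping.
    def to_dual(x):
        # nabla Phi(x), applied componentwise
        return np.sign(x) * np.abs(x) ** (p - 1.0)

    def to_primal(y):
        # inverse of nabla Phi, applied componentwise
        return np.sign(y) * np.abs(y) ** (1.0 / (p - 1.0))

    x = np.asarray(x0, dtype=float).copy()
    running_sum = np.zeros_like(x)
    for _ in range(steps):
        g = grad_oracle(x)                   # unbiased, possibly heavy-tailed stochastic gradient
        x = to_primal(to_dual(x) - eta * g)  # descent step taken in the dual space
        running_sum += x
    return running_sum / steps               # averaged iterate

# Toy usage: minimize f(x) = 0.5 * ||x||_2^2 under noise with infinite variance
# (Student-t noise with 1.5 degrees of freedom has finite moments only below order 1.5).
rng = np.random.default_rng(0)
oracle = lambda x: x + rng.standard_t(df=1.5, size=x.shape)
x_hat = smd_heavy_tailed(oracle, x0=np.ones(10), steps=10_000, eta=0.01)

The mirror step replaces the Euclidean update of SGD: the iterate is mapped to the dual space via $\nabla\Phi$, shifted by the stochastic gradient, and mapped back via $(\nabla\Phi)^{-1}$, which is what lets a suitably uniformly convex $\Phi$ absorb heavy-tailed noise without explicit clipping.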

Cite this Paper


BibTeX
@InProceedings{pmlr-v178-vural22a,
  title     = {Mirror Descent Strikes Again: Optimal Stochastic Convex Optimization under Infinite Noise Variance},
  author    = {Vural, Nuri Mert and Yu, Lu and Balasubramanian, Krishna and Volgushev, Stanislav and Erdogdu, Murat A},
  booktitle = {Proceedings of Thirty Fifth Conference on Learning Theory},
  pages     = {65--102},
  year      = {2022},
  editor    = {Loh, Po-Ling and Raginsky, Maxim},
  volume    = {178},
  series    = {Proceedings of Machine Learning Research},
  month     = {02--05 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v178/vural22a/vural22a.pdf},
  url       = {https://proceedings.mlr.press/v178/vural22a.html}
}
Endnote
%0 Conference Paper
%T Mirror Descent Strikes Again: Optimal Stochastic Convex Optimization under Infinite Noise Variance
%A Nuri Mert Vural
%A Lu Yu
%A Krishna Balasubramanian
%A Stanislav Volgushev
%A Murat A Erdogdu
%B Proceedings of Thirty Fifth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2022
%E Po-Ling Loh
%E Maxim Raginsky
%F pmlr-v178-vural22a
%I PMLR
%P 65--102
%U https://proceedings.mlr.press/v178/vural22a.html
%V 178
APA
Vural, N.M., Yu, L., Balasubramanian, K., Volgushev, S. & Erdogdu, M.A. (2022). Mirror Descent Strikes Again: Optimal Stochastic Convex Optimization under Infinite Noise Variance. Proceedings of Thirty Fifth Conference on Learning Theory, in Proceedings of Machine Learning Research 178:65-102. Available from https://proceedings.mlr.press/v178/vural22a.html.
