Heavy-tailed Streaming Statistical Estimation

Che-Ping Tsai; Adarsh Prasad; Sivaraman Balakrishnan; Pradeep Ravikumar

Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:1251-1282, 2022.

Abstract

We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples. This could also be viewed as stochastic optimization under heavy-tailed distributions, with an additional $O(p)$ space complexity constraint. We design a clipped stochastic gradient descent algorithm and provide an improved analysis, under a more nuanced condition on the noise of the stochastic gradients, which we show is critical when analyzing stochastic optimization problems arising from general statistical estimation problems. Our results guarantee convergence not just in expectation but with exponential concentration, and moreover do so using $O(1)$ batch size. We provide consequences of our results for mean estimation and linear regression. Finally, we provide empirical corroboration of our results and algorithms via synthetic experiments for mean estimation and linear regression.
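As a rough illustration of the setting described in the abstract, the sketch below runs one pass of clipped stochastic gradient descent for streaming mean estimation, using one sample per step and storing only the current iterate ($O(p)$ memory). It is not the paper's algorithm as specified: the function names, the fixed `threshold`, and the `1/t` step size are all assumptions made for illustration, since the abstract does not state the step-size or clipping schedule.

```python
import math


def clip(vec, threshold):
    """Rescale vec so its Euclidean norm is at most threshold."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm <= threshold:
        return list(vec)
    scale = threshold / norm
    return [v * scale for v in vec]


def streaming_clipped_mean(samples, dim, threshold, step=lambda t: 1.0 / t):
    """One pass of clipped SGD on f(theta) = E[||theta - x||^2] / 2.

    Hypothetical helper: `threshold` and the default 1/t step size are
    illustrative assumptions, not values taken from the paper.
    Only the current iterate is kept, so memory use is O(dim).
    """
    theta = [0.0] * dim
    for t, x in enumerate(samples, start=1):
        # Stochastic gradient from a single sample (batch size 1).
        grad = [th - xi for th, xi in zip(theta, x)]
        clipped = clip(grad, threshold)
        eta = step(t)
        theta = [th - eta * g for th, g in zip(theta, clipped)]
    return theta


if __name__ == "__main__":
    # Fifty zeros followed by one extreme outlier.
    stream = [[0.0]] * 50 + [[1e9]]
    print(streaming_clipped_mean(stream, dim=1, threshold=1.0))
```

With an infinite threshold and the `1/t` step, the update reduces to the ordinary running sample mean; a finite threshold caps how far any single sample, however extreme, can move the estimate.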

Cite this Paper

BibTeX


@InProceedings{pmlr-v151-tsai22a,
  title = 	 {Heavy-tailed Streaming Statistical Estimation},
  author =       {Tsai, Che-Ping and Prasad, Adarsh and Balakrishnan, Sivaraman and Ravikumar, Pradeep},
  booktitle = 	 {Proceedings of The 25th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {1251--1282},
  year = 	 {2022},
  editor = 	 {Camps-Valls, Gustau and Ruiz, Francisco J. R. and Valera, Isabel},
  volume = 	 {151},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {28--30 Mar},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v151/tsai22a/tsai22a.pdf},
  url = 	 {https://proceedings.mlr.press/v151/tsai22a.html},
  abstract = 	 {We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples. This could also be viewed as stochastic optimization under heavy-tailed distributions, with an additional $O(p)$ space complexity constraint. We design a clipped stochastic gradient descent algorithm and provide an improved analysis, under a more nuanced condition on the noise of the stochastic gradients, which we show is critical when analyzing stochastic optimization problems arising from general statistical estimation problems. Our results guarantee convergence not just in expectation but with exponential concentration, and moreover do so using $O(1)$ batch size. We provide consequences of our results for mean estimation and linear regression. Finally, we provide empirical corroboration of our results and algorithms via synthetic experiments for mean estimation and linear regression.}
}

Endnote

%0 Conference Paper
%T Heavy-tailed Streaming Statistical Estimation
%A Che-Ping Tsai
%A Adarsh Prasad
%A Sivaraman Balakrishnan
%A Pradeep Ravikumar
%B Proceedings of The 25th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2022
%E Gustau Camps-Valls
%E Francisco J. R. Ruiz
%E Isabel Valera
%F pmlr-v151-tsai22a
%I PMLR
%P 1251--1282
%U https://proceedings.mlr.press/v151/tsai22a.html
%V 151
%X We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples. This could also be viewed as stochastic optimization under heavy-tailed distributions, with an additional $O(p)$ space complexity constraint. We design a clipped stochastic gradient descent algorithm and provide an improved analysis, under a more nuanced condition on the noise of the stochastic gradients, which we show is critical when analyzing stochastic optimization problems arising from general statistical estimation problems. Our results guarantee convergence not just in expectation but with exponential concentration, and moreover do so using $O(1)$ batch size. We provide consequences of our results for mean estimation and linear regression. Finally, we provide empirical corroboration of our results and algorithms via synthetic experiments for mean estimation and linear regression.

APA


Tsai, C.-P., Prasad, A., Balakrishnan, S., & Ravikumar, P. (2022). Heavy-tailed Streaming Statistical Estimation. Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 151:1251-1282. Available from https://proceedings.mlr.press/v151/tsai22a.html.
