All-Purpose Mean Estimation over R: Optimal Sub-Gaussianity with Outlier Robustness and Low Moments Performance

Jasper C.H. Lee, Walter Mckelvie, Maoyuan Song, Paul Valiant
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:33435-33475, 2025.

Abstract

We consider the basic statistical challenge of designing an "all-purpose" mean estimation algorithm that is recommendable across a variety of settings and models. Recent work by [Lee and Valiant 2022] introduced the first 1-d mean estimator whose error in the standard finite-variance+i.i.d. setting is optimal even in its constant factors; its good performance was experimentally demonstrated by [Gobet et al. 2022]. Yet, unlike for classic (but not necessarily practical) estimators such as median-of-means and trimmed mean, this new algorithm lacked proven robustness guarantees in other settings, including the settings of adversarial data corruption and heavy-tailed distributions with infinite variance. Such robustness is important for practical use cases. This raises a research question: is it possible to have a mean estimator that is robust, without sacrificing provably optimal performance in the standard i.i.d. setting? In this work, we show that Lee and Valiant’s estimator is in fact an "all-purpose" mean estimator by proving: (A) It is robust to an $\eta$-fraction of data corruption, even in the strong contamination model; it has optimal estimation error $O(\sigma\sqrt{\eta})$ for distributions with variance $\sigma^2$. (B) For distributions with finite $z^\text{th}$ moment, for $z \in (1,2)$, it has optimal estimation error, matching the lower bounds of [Devroye et al. 2016] up to constants. We further show (C) that outlier robustness for 1-d mean estimators in fact implies neighborhood optimality, a notion of beyond-worst-case, distribution-dependent optimality recently introduced by [Dang et al. 2023]. Previously, such an optimality guarantee was only known for median-of-means, but it now holds also for all estimators that are simultaneously robust and sub-Gaussian, including Lee and Valiant’s, resolving a question raised by Dang et al. Lastly, we show (D) the asymptotic normality and efficiency of Lee and Valiant’s estimator, as further evidence for its performance across many settings.
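Concretely, the three error regimes discussed above take the following forms. Only the $O(\sigma\sqrt{\eta})$ bound is quoted verbatim from the abstract; the first display is the optimal sub-Gaussian rate of [Lee and Valiant 2022] and the last is the heavy-tailed rate matching the [Devroye et al. 2016] lower bound, both as we recall them from the literature, so treat their exact constants as a sketch to be checked against the paper:

\[
\Pr\left[\,|\hat{\mu} - \mu| \le (1+o(1))\,\sigma\sqrt{\frac{2\ln(2/\delta)}{n}}\,\right] \ge 1-\delta
\qquad \text{(i.i.d., finite variance $\sigma^2$)}
\]
\[
|\hat{\mu} - \mu| = O(\sigma\sqrt{\eta})
\qquad \text{($\eta$-fraction strong contamination)}
\]
\[
|\hat{\mu} - \mu| = O\!\left(\sigma_z \left(\frac{\ln(1/\delta)}{n}\right)^{1-1/z}\right),
\quad \sigma_z^z = \mathbb{E}|X-\mu|^z,\ z \in (1,2)
\qquad \text{(finite $z^\text{th}$ moment)}
\]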

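For readers who want to experiment, below is a minimal Python sketch of the two classic robust baselines named in the abstract, median-of-means and trimmed mean. The block count k and trimming fraction eps are illustrative choices, not tuned values from the paper, and the Lee-Valiant estimator itself is not reproduced here.

import numpy as np

def median_of_means(x, k=30, seed=0):
    """Median-of-means: shuffle, split into k blocks, average each block,
    and return the median of the k block means."""
    x = np.random.default_rng(seed).permutation(np.asarray(x, dtype=float))
    return float(np.median([b.mean() for b in np.array_split(x, k)]))

def trimmed_mean(x, eps=0.05):
    """Trimmed mean: drop the eps-fraction smallest and largest points
    (assumes eps < 0.5), then average what remains."""
    x = np.sort(np.asarray(x, dtype=float))
    t = int(eps * len(x))
    return float(x[t:len(x) - t].mean())

# Heavy-tailed demo: Student-t with 1.5 degrees of freedom has mean 0 but
# infinite variance (finite z-th moments only for z < 1.5), i.e. the
# z in (1,2) regime from the abstract.
rng = np.random.default_rng(1)
sample = rng.standard_t(df=1.5, size=100_000)
print(np.mean(sample), median_of_means(sample), trimmed_mean(sample))

On such data the empirical mean fluctuates wildly across runs, while the two robust estimates concentrate near 0; this is the qualitative behavior that result (B) formalizes for the Lee-Valiant estimator.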
Cite this Paper


BibTeX
@InProceedings{pmlr-v267-lee25w,
  title     = {All-Purpose Mean Estimation over R: Optimal Sub-Gaussianity with Outlier Robustness and Low Moments Performance},
  author    = {Lee, Jasper C.H. and Mckelvie, Walter and Song, Maoyuan and Valiant, Paul},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {33435--33475},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/lee25w/lee25w.pdf},
  url       = {https://proceedings.mlr.press/v267/lee25w.html}
}
Endnote
%0 Conference Paper
%T All-Purpose Mean Estimation over R: Optimal Sub-Gaussianity with Outlier Robustness and Low Moments Performance
%A Jasper C.H. Lee
%A Walter Mckelvie
%A Maoyuan Song
%A Paul Valiant
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-lee25w
%I PMLR
%P 33435--33475
%U https://proceedings.mlr.press/v267/lee25w.html
%V 267
APA
Lee, J.C., Mckelvie, W., Song, M. & Valiant, P. (2025). All-Purpose Mean Estimation over R: Optimal Sub-Gaussianity with Outlier Robustness and Low Moments Performance. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:33435-33475. Available from https://proceedings.mlr.press/v267/lee25w.html.
