Uniform Mean Estimation for Heavy-Tailed Distributions via Median-of-Means

Mikael Møller Høgsgaard, Andrea Paudice
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:23357-23381, 2025.

Abstract

The Median of Means (MoM) is a mean estimator that has gained popularity in the context of heavy-tailed data. In this work, we analyze its performance in the task of simultaneously estimating the mean of each function in a class $\mathcal{F}$ when the data distribution possesses only the first $p$ moments for $p \in (1,2]$. We prove a new sample complexity bound using a novel symmetrization technique that may be of independent interest. Additionally, we present applications of our result to $k$-means clustering with unbounded inputs and linear regression with general losses, improving upon existing works.
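For readers unfamiliar with the estimator, the following is a minimal sketch of the standard single-function Median of Means, not the uniform estimator or analysis developed in the paper. The function name `median_of_means`, its parameters, and the choice to drop leftover samples are assumptions made for illustration.

```python
import statistics


def median_of_means(samples, num_blocks, rng=None):
    """Textbook median-of-means (illustrative sketch, not the paper's method).

    Split the samples into `num_blocks` equal-size blocks, average each
    block, and return the median of the block averages.
    """
    data = list(samples)
    if num_blocks < 1 or num_blocks > len(data):
        raise ValueError("num_blocks must be between 1 and len(samples)")
    if rng is not None:
        # Optional: shuffle so block membership does not depend on input order.
        rng.shuffle(data)
    # Assumption: samples that do not fill a complete block are dropped.
    block_size = len(data) // num_blocks
    block_means = [
        statistics.fmean(data[i * block_size:(i + 1) * block_size])
        for i in range(num_blocks)
    ]
    return statistics.median(block_means)
```

A single extreme observation can corrupt at most one block mean, so the median over blocks stays close to the bulk of the data: `median_of_means([1, 1, 1, 1, 1, 1000], 3)` returns `1.0`, whereas the plain average of the same samples is about 167.5. To estimate the mean of a function $f \in \mathcal{F}$, one would pass the values $f(x_i)$ as `samples`.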

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-hogsgaard25a,
  title = {Uniform Mean Estimation for Heavy-Tailed Distributions via Median-of-Means},
  author = {H{\o}gsgaard, Mikael M{\o}ller and Paudice, Andrea},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {23357--23381},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/hogsgaard25a/hogsgaard25a.pdf},
  url = {https://proceedings.mlr.press/v267/hogsgaard25a.html},
  abstract = {The Median of Means (MoM) is a mean estimator that has gained popularity in the context of heavy-tailed data. In this work, we analyze its performance in the task of simultaneously estimating the mean of each function in a class $\mathcal{F}$ when the data distribution possesses only the first $p$ moments for $p \in (1,2]$. We prove a new sample complexity bound using a novel symmetrization technique that may be of independent interest. Additionally, we present applications of our result to $k$-means clustering with unbounded inputs and linear regression with general losses, improving upon existing works.}
}
Endnote
%0 Conference Paper
%T Uniform Mean Estimation for Heavy-Tailed Distributions via Median-of-Means
%A Mikael Møller Høgsgaard
%A Andrea Paudice
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-hogsgaard25a
%I PMLR
%P 23357--23381
%U https://proceedings.mlr.press/v267/hogsgaard25a.html
%V 267
%X The Median of Means (MoM) is a mean estimator that has gained popularity in the context of heavy-tailed data. In this work, we analyze its performance in the task of simultaneously estimating the mean of each function in a class $\mathcal{F}$ when the data distribution possesses only the first $p$ moments for $p \in (1,2]$. We prove a new sample complexity bound using a novel symmetrization technique that may be of independent interest. Additionally, we present applications of our result to $k$-means clustering with unbounded inputs and linear regression with general losses, improving upon existing works.
APA
Høgsgaard, M.M. & Paudice, A. (2025). Uniform Mean Estimation for Heavy-Tailed Distributions via Median-of-Means. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:23357-23381. Available from https://proceedings.mlr.press/v267/hogsgaard25a.html.
