An Empirical Bayes Perspective on Heteroskedastic Mean Estimation

Yanjun Han, Abhishek Shetty, Jacob Shkrob
Proceedings of Thirty Ninth Conference on Learning Theory, PMLR 336:3076-3108, 2026.

Abstract

Towards understanding the fundamental limits of estimation from data of varied quality, we study the problem of estimating a mean parameter from heteroskedastic Gaussian observations where the variances are unknown and may vary across observations. While, with known variances, a simple linear estimator attains the smallest mean squared error, estimation without this knowledge is challenging due to the large number of nuisance parameters. We propose a simple and principled approach based on empirical Bayes: model the observations as if they were i.i.d. from a normal scale mixture and compute the profile maximum likelihood estimator (MLE) for the mean, treating the nonparametric mixing distribution as nuisance. Our result shows that this estimator achieves near-optimal error bounds across various heteroskedastic models in the literature. In particular, for the subset-of-signals problem where an unknown subset of observations has small variance, our estimator adaptively achieves the minimax rate for all signal sizes, including the sharp phase transition, without any tuning parameters. One of our key technical steps is a sharper metric entropy bound for normal scale mixtures, obtained via generalized moment matching and Chebyshev approximation. This approach yields an improved polylogarithmic, rather than polynomial, dependence on problem parameters, which could be of independent interest.
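The two estimators mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything beyond the abstract is an assumption made for illustration: the known-variance linear estimator is taken to be the standard inverse-variance weighted mean, the nonparametric mixing distribution is restricted to a fixed log-spaced grid of scales, the mixing weights are fitted by EM, and the mean is searched over a user-supplied grid. The function names, grid ranges, and iteration count are all hypothetical.

```python
import math
import random


def inverse_variance_mean(xs, sigmas):
    """Known-variance benchmark: weight each observation by 1 / sigma_i^2."""
    weights = [1.0 / (s * s) for s in sigmas]
    return sum(w * x for w, x in zip(weights, xs)) / sum(weights)


def log_scale_grid(lo=0.05, hi=20.0, k=12):
    """Hypothetical log-spaced grid of candidate standard deviations."""
    return [lo * (hi / lo) ** (j / (k - 1)) for j in range(k)]


def profile_loglik(xs, theta, scales, n_iter=50):
    """For a fixed mean theta, maximize the normal scale mixture
    log-likelihood over the mixing weights on `scales` using EM."""
    norm = math.sqrt(2.0 * math.pi)
    # dens[i][j] = density of N(theta, scales[j]^2) evaluated at xs[i]
    dens = [[math.exp(-0.5 * ((x - theta) / s) ** 2) / (s * norm)
             for s in scales] for x in xs]
    w = [1.0 / len(scales)] * len(scales)
    for _ in range(n_iter):
        # Mixture density at each observation (floored to avoid division by zero)
        mix = [max(sum(wj * d for wj, d in zip(w, row)), 1e-300) for row in dens]
        # EM update: new weight = average posterior probability of each scale
        w = [wj * sum(row[j] / m for row, m in zip(dens, mix)) / len(xs)
             for j, wj in enumerate(w)]
    mix = [max(sum(wj * d for wj, d in zip(w, row)), 1e-300) for row in dens]
    return sum(math.log(m) for m in mix)


def profile_mle_mean(xs, theta_grid, scales=None):
    """Grid-search approximation: pick the theta with the largest profile log-likelihood."""
    scales = scales or log_scale_grid()
    return max(theta_grid, key=lambda t: profile_loglik(xs, t, scales))


if __name__ == "__main__":
    rng = random.Random(0)
    true_mean = 1.0
    # Toy subset-of-signals data: 40 precise and 80 noisy observations
    sigmas = [0.1] * 40 + [5.0] * 80
    xs = [rng.gauss(true_mean, s) for s in sigmas]
    theta_grid = [-1.0 + 0.1 * i for i in range(41)]
    print("sample mean:          ", sum(xs) / len(xs))
    print("known-variance mean:  ", inverse_variance_mean(xs, sigmas))
    print("profile MLE (sketch): ", profile_mle_mean(xs, theta_grid))
```

The profile step never sees `sigmas`; only the benchmark `inverse_variance_mean` uses them. The grid search over theta keeps the sketch short, at the cost of limiting resolution to the grid spacing.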

Cite this Paper


BibTeX
@InProceedings{pmlr-v336-han26a,
  title = {An Empirical {Bayes} Perspective on Heteroskedastic Mean Estimation},
  author = {Han, Yanjun and Shetty, Abhishek and Shkrob, Jacob},
  booktitle = {Proceedings of Thirty Ninth Conference on Learning Theory},
  pages = {3076--3108},
  year = {2026},
  editor = {Hanneke, Steve and Lattimore, Tor},
  volume = {336},
  series = {Proceedings of Machine Learning Research},
  month = {29 Jun--03 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v336/main/assets/han26a/han26a.pdf},
  url = {https://proceedings.mlr.press/v336/han26a.html},
  abstract = {Towards understanding the fundamental limits of estimation from data of varied quality, we study the problem of estimating a mean parameter from heteroskedastic Gaussian observations where the variances are unknown and may vary across observations. While, with known variances, a simple linear estimator attains the smallest mean squared error, estimation without this knowledge is challenging due to the large number of nuisance parameters. We propose a simple and principled approach based on empirical Bayes: model the observations as if they were i.i.d. from a normal scale mixture and compute the profile maximum likelihood estimator (MLE) for the mean, treating the nonparametric mixing distribution as nuisance. Our result shows that this estimator achieves near-optimal error bounds across various heteroskedastic models in the literature. In particular, for the subset-of-signals problem where an unknown subset of observations has small variance, our estimator adaptively achieves the minimax rate for all signal sizes, including the sharp phase transition, without any tuning parameters. One of our key technical steps is a sharper metric entropy bound for normal scale mixtures, obtained via generalized moment matching and Chebyshev approximation. This approach yields an improved polylogarithmic, rather than polynomial, dependence on problem parameters, which could be of independent interest.}
}
Endnote
%0 Conference Paper
%T An Empirical Bayes Perspective on Heteroskedastic Mean Estimation
%A Yanjun Han
%A Abhishek Shetty
%A Jacob Shkrob
%B Proceedings of Thirty Ninth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2026
%E Steve Hanneke
%E Tor Lattimore
%F pmlr-v336-han26a
%I PMLR
%P 3076--3108
%U https://proceedings.mlr.press/v336/han26a.html
%V 336
%X Towards understanding the fundamental limits of estimation from data of varied quality, we study the problem of estimating a mean parameter from heteroskedastic Gaussian observations where the variances are unknown and may vary across observations. While, with known variances, a simple linear estimator attains the smallest mean squared error, estimation without this knowledge is challenging due to the large number of nuisance parameters. We propose a simple and principled approach based on empirical Bayes: model the observations as if they were i.i.d. from a normal scale mixture and compute the profile maximum likelihood estimator (MLE) for the mean, treating the nonparametric mixing distribution as nuisance. Our result shows that this estimator achieves near-optimal error bounds across various heteroskedastic models in the literature. In particular, for the subset-of-signals problem where an unknown subset of observations has small variance, our estimator adaptively achieves the minimax rate for all signal sizes, including the sharp phase transition, without any tuning parameters. One of our key technical steps is a sharper metric entropy bound for normal scale mixtures, obtained via generalized moment matching and Chebyshev approximation. This approach yields an improved polylogarithmic, rather than polynomial, dependence on problem parameters, which could be of independent interest.
APA
Han, Y., Shetty, A. & Shkrob, J. (2026). An Empirical Bayes Perspective on Heteroskedastic Mean Estimation. Proceedings of Thirty Ninth Conference on Learning Theory, in Proceedings of Machine Learning Research 336:3076-3108. Available from https://proceedings.mlr.press/v336/han26a.html.
