Optimal Mean Estimation without a Variance
Proceedings of Thirty Fifth Conference on Learning Theory, PMLR 178:356-357, 2022.
Abstract
We study the problem of heavy-tailed mean estimation in settings where the variance of the data-generating distribution does not exist. Concretely, we are given a sample $X_1, \dots, X_n$ from a distribution $\mathcal{D}$ over $\mathbb{R}^d$ with mean $\mu$ which satisfies the following \emph{weak-moment} assumption for some $\alpha \in [0, 1]$: \begin{equation*} \forall \|v\| = 1: \quad \mathbb{E}_{X \sim \mathcal{D}}\bigl[\lvert\langle X - \mu, v\rangle\rvert^{1 + \alpha}\bigr] \leq 1, \end{equation*} and, given a target failure probability $\delta$, our goal is to design an estimator attaining the smallest possible confidence interval as a function of $n, d, \delta$. For the specific case of $\alpha = 1$, foundational work of Lugosi and Mendelson exhibits an estimator achieving \emph{optimal} subgaussian confidence intervals, and subsequent work has led to computationally efficient versions of this estimator. Here, we study the case of general $\alpha$ and provide a precise characterization of the optimal achievable confidence interval by establishing the following information-theoretic lower bound: \begin{equation*} \Omega\left(\sqrt{\frac{d}{n}} + \left(\frac{d}{n}\right)^{\frac{\alpha}{1 + \alpha}} + \left(\frac{\log(1/\delta)}{n}\right)^{\frac{\alpha}{1 + \alpha}}\right), \end{equation*} and by devising an estimator that matches this lower bound up to constants. Moreover, our estimator is computationally efficient.
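The paper's optimal estimator is developed in the full text. As a rough illustration of the setting only, the following is a minimal sketch of the classical (geometric) median-of-means baseline for heavy-tailed mean estimation; it is not the estimator proposed in the paper, and the bucket count below is a heuristic choice rather than a tuned constant.

```python
import numpy as np

def geometric_median(points, iters=100, tol=1e-8):
    """Weiszfeld's algorithm for the geometric median of the row vectors."""
    z = points.mean(axis=0)
    for _ in range(iters):
        d = np.linalg.norm(points - z, axis=1)
        d = np.maximum(d, tol)                      # avoid division by zero
        w = 1.0 / d
        z_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(z_new - z) < tol:
            break
        z = z_new
    return z

def median_of_means(X, delta=0.05):
    """Split the sample into ~log(1/delta) buckets, average each bucket,
    and return the geometric median of the bucket means."""
    n = X.shape[0]
    k = min(n, max(1, int(np.ceil(8 * np.log(1.0 / delta)))))  # heuristic bucket count
    buckets = np.array_split(np.random.permutation(n), k)
    means = np.stack([X[b].mean(axis=0) for b in buckets])
    return geometric_median(means)

# Usage: a heavy-tailed sample with finite (1 + alpha)-th moment but infinite variance.
rng = np.random.default_rng(0)
X = rng.standard_t(df=1.5, size=(10_000, 5))   # true mean is 0
print(median_of_means(X, delta=0.01))
```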