Towards Practical Mean Bounds for Small Samples

My Phan; Philip Thomas; Erik Learned-Miller

Proceedings of the 38th International Conference on Machine Learning, PMLR 139:8567-8576, 2021.

Abstract

Historically, to bound the mean for small sample sizes, practitioners have had to choose between using methods with unrealistic assumptions about the unknown distribution (e.g., Gaussianity) and methods like Hoeffding’s inequality that use weaker assumptions but produce much looser (wider) intervals. In 1969, Anderson proposed a mean confidence interval that is always at least as tight as Hoeffding’s and whose only assumption is that the distribution’s support is contained in an interval $[a,b]$. For the first time since then, we present a new family of bounds that compares favorably to Anderson’s. We prove that each bound in the family has guaranteed coverage, i.e., it holds with probability at least $1-\alpha$ for all distributions on an interval $[a,b]$. Furthermore, one of the bounds is tighter than or equal to Anderson’s for all samples. In simulations, we show that for many distributions, the gain over Anderson’s bound is substantial.
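
To make the comparison between the two baselines in the abstract concrete, the following is a minimal sketch of one-sided upper confidence bounds on the mean for a sample supported on $[a,b]$. It is not the new family of bounds from the paper. The function names are hypothetical, and the Anderson-style bound here is an assumption: it uses the one-sided Dvoretzky-Kiefer-Wolfowitz (DKW) inequality for the CDF band rather than the exact Kolmogorov-Smirnov quantile in Anderson's original construction.

```python
import math


def hoeffding_upper(sample, a, b, alpha):
    """One-sided Hoeffding upper bound on the mean (holds w.p. >= 1 - alpha)."""
    n = len(sample)
    eps = math.sqrt(math.log(1.0 / alpha) / (2.0 * n))
    return sum(sample) / n + (b - a) * eps


def anderson_dkw_upper(sample, a, b, alpha):
    """Anderson-style upper bound on the mean, using a DKW band (assumption).

    The empirical CDF is shifted down by eps and clipped at 0, which gives a
    lower envelope on the true CDF. For a distribution on [a, b] the mean
    equals b minus the integral of the CDF over [a, b], so integrating the
    lower envelope yields an upper bound on the mean.
    The one-sided DKW inequality is valid here for alpha <= 0.5.
    """
    n = len(sample)
    eps = math.sqrt(math.log(1.0 / alpha) / (2.0 * n))
    z = sorted(sample) + [b]  # order statistics, with b appended as the last knot
    area = 0.0
    for i in range(1, n + 1):
        # The empirical CDF equals i/n on [z[i-1], z[i]).
        height = max(0.0, i / n - eps)
        area += (z[i] - z[i - 1]) * height
    return b - area


if __name__ == "__main__":
    data = [0.1, 0.2, 0.4, 0.3, 0.25]  # illustrative values, not from the paper
    print("Hoeffding:", hoeffding_upper(data, 0.0, 1.0, 0.05))
    print("Anderson (DKW variant):", anderson_dkw_upper(data, 0.0, 1.0, 0.05))
```

Under this DKW variant both bounds use the same eps, and clipping the shifted CDF at 0 can only increase the integrated area, so the Anderson-style bound is never larger than the Hoeffding bound on the same sample.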

Cite this Paper

BibTeX

@InProceedings{pmlr-v139-phan21a,
  title = 	 {Towards Practical Mean Bounds for Small Samples},
  author =       {Phan, My and Thomas, Philip and Learned-Miller, Erik},
  booktitle = 	 {Proceedings of the 38th International Conference on Machine Learning},
  pages = 	 {8567--8576},
  year = 	 {2021},
  editor = 	 {Meila, Marina and Zhang, Tong},
  volume = 	 {139},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {18--24 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v139/phan21a/phan21a.pdf},
  url = 	 {https://proceedings.mlr.press/v139/phan21a.html},
  abstract = 	 {Historically, to bound the mean for small sample sizes, practitioners have had to choose between using methods with unrealistic assumptions about the unknown distribution (e.g., Gaussianity) and methods like Hoeffding’s inequality that use weaker assumptions but produce much looser (wider) intervals. In 1969, \citet{Anderson1969} proposed a mean confidence interval strictly better than or equal to Hoeffding’s whose only assumption is that the distribution’s support is contained in an interval $[a,b]$. For the first time since then, we present a new family of bounds that compares favorably to Anderson’s. We prove that each bound in the family has {\em guaranteed coverage}, i.e., it holds with probability at least $1-\alpha$ for all distributions on an interval $[a,b]$. Furthermore, one of the bounds is tighter than or equal to Anderson’s for all samples. In simulations, we show that for many distributions, the gain over Anderson’s bound is substantial.}
}

Endnote

%0 Conference Paper
%T Towards Practical Mean Bounds for Small Samples
%A My Phan
%A Philip Thomas
%A Erik Learned-Miller
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang	
%F pmlr-v139-phan21a
%I PMLR
%P 8567--8576
%U https://proceedings.mlr.press/v139/phan21a.html
%V 139
%X Historically, to bound the mean for small sample sizes, practitioners have had to choose between using methods with unrealistic assumptions about the unknown distribution (e.g., Gaussianity) and methods like Hoeffding’s inequality that use weaker assumptions but produce much looser (wider) intervals. In 1969, \citet{Anderson1969} proposed a mean confidence interval strictly better than or equal to Hoeffding’s whose only assumption is that the distribution’s support is contained in an interval $[a,b]$. For the first time since then, we present a new family of bounds that compares favorably to Anderson’s. We prove that each bound in the family has {\em guaranteed coverage}, i.e., it holds with probability at least $1-\alpha$ for all distributions on an interval $[a,b]$. Furthermore, one of the bounds is tighter than or equal to Anderson’s for all samples. In simulations, we show that for many distributions, the gain over Anderson’s bound is substantial.

APA

Phan, M., Thomas, P. & Learned-Miller, E. (2021). Towards Practical Mean Bounds for Small Samples. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:8567-8576. Available from https://proceedings.mlr.press/v139/phan21a.html.
