Doubly-Competitive Distribution Estimation

Yi Hao, Alon Orlitsky
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:2614-2623, 2019.

Abstract

Distribution estimation is a statistical-learning cornerstone. Its classical min-max formulation minimizes the estimation error for the worst distribution, hence under-performs for practical distributions that, like power-law, are often rather simple. Modern research has therefore focused on two frameworks: structural estimation that improves learning accuracy by assuming a simple structure of the underlying distribution; and competitive, or instance-optimal, estimation that achieves the performance of a genie aided estimator up to a small excess error that vanishes as the sample size grows, regardless of the distribution. This paper combines and strengthens the two frameworks. It designs a single estimator whose excess error vanishes both at a universal rate as the sample size grows, as well as when the (unknown) distribution gets simpler. We show that the resulting algorithm significantly improves the performance guarantees for numerous competitive- and structural-estimation results. The algorithm runs in near-linear time and is robust to model misspecification and domain-symbol permutations.

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-hao19a, title = {Doubly-Competitive Distribution Estimation}, author = {Hao, Yi and Orlitsky, Alon}, booktitle = {Proceedings of the 36th International Conference on Machine Learning}, pages = {2614--2623}, year = {2019}, editor = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan}, volume = {97}, series = {Proceedings of Machine Learning Research}, month = {09--15 Jun}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v97/hao19a/hao19a.pdf}, url = {https://proceedings.mlr.press/v97/hao19a.html}, abstract = {Distribution estimation is a statistical-learning cornerstone. Its classical min-max formulation minimizes the estimation error for the worst distribution, hence under-performs for practical distributions that, like power-law, are often rather simple. Modern research has therefore focused on two frameworks: structural estimation that improves learning accuracy by assuming a simple structure of the underlying distribution; and competitive, or instance-optimal, estimation that achieves the performance of a genie aided estimator up to a small excess error that vanishes as the sample size grows, regardless of the distribution. This paper combines and strengthens the two frameworks. It designs a single estimator whose excess error vanishes both at a universal rate as the sample size grows, as well as when the (unknown) distribution gets simpler. We show that the resulting algorithm significantly improves the performance guarantees for numerous competitive- and structural-estimation results. The algorithm runs in near-linear time and is robust to model misspecification and domain-symbol permutations.} }
Endnote
%0 Conference Paper %T Doubly-Competitive Distribution Estimation %A Yi Hao %A Alon Orlitsky %B Proceedings of the 36th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2019 %E Kamalika Chaudhuri %E Ruslan Salakhutdinov %F pmlr-v97-hao19a %I PMLR %P 2614--2623 %U https://proceedings.mlr.press/v97/hao19a.html %V 97 %X Distribution estimation is a statistical-learning cornerstone. Its classical min-max formulation minimizes the estimation error for the worst distribution, hence under-performs for practical distributions that, like power-law, are often rather simple. Modern research has therefore focused on two frameworks: structural estimation that improves learning accuracy by assuming a simple structure of the underlying distribution; and competitive, or instance-optimal, estimation that achieves the performance of a genie aided estimator up to a small excess error that vanishes as the sample size grows, regardless of the distribution. This paper combines and strengthens the two frameworks. It designs a single estimator whose excess error vanishes both at a universal rate as the sample size grows, as well as when the (unknown) distribution gets simpler. We show that the resulting algorithm significantly improves the performance guarantees for numerous competitive- and structural-estimation results. The algorithm runs in near-linear time and is robust to model misspecification and domain-symbol permutations.
APA
Hao, Y. & Orlitsky, A.. (2019). Doubly-Competitive Distribution Estimation. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:2614-2623 Available from https://proceedings.mlr.press/v97/hao19a.html.

Related Material