A Statistical Perspective on Coreset Density Estimation

Paxton Turner, Jingbo Liu, Philippe Rigollet
Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:2512-2520, 2021.

Abstract

Coresets have emerged as a powerful tool to summarize data by selecting a small subset of the original observations while retaining most of its information. This approach has led to significant computational speedups, but the performance of statistical procedures run on coresets is largely unexplored. In this work, we develop a statistical framework to study coresets and focus on the canonical task of nonparametric density estimation. Our contributions are twofold. First, we establish the minimax rate of estimation achievable by coreset-based estimators. Second, we show that the practical coreset kernel density estimators are near-minimax optimal over a large class of Hölder-smooth densities.

Cite this Paper


BibTeX
@InProceedings{pmlr-v130-turner21b,
  title     = {A Statistical Perspective on Coreset Density Estimation},
  author    = {Turner, Paxton and Liu, Jingbo and Rigollet, Philippe},
  booktitle = {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages     = {2512--2520},
  year      = {2021},
  editor    = {Banerjee, Arindam and Fukumizu, Kenji},
  volume    = {130},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--15 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v130/turner21b/turner21b.pdf},
  url       = {https://proceedings.mlr.press/v130/turner21b.html},
  abstract  = {Coresets have emerged as a powerful tool to summarize data by selecting a small subset of the original observations while retaining most of its information. This approach has led to significant computational speedups, but the performance of statistical procedures run on coresets is largely unexplored. In this work, we develop a statistical framework to study coresets and focus on the canonical task of nonparametric density estimation. Our contributions are twofold. First, we establish the minimax rate of estimation achievable by coreset-based estimators. Second, we show that the practical coreset kernel density estimators are near-minimax optimal over a large class of H{\"o}lder-smooth densities.}
}
Endnote
%0 Conference Paper
%T A Statistical Perspective on Coreset Density Estimation
%A Paxton Turner
%A Jingbo Liu
%A Philippe Rigollet
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu
%F pmlr-v130-turner21b
%I PMLR
%P 2512--2520
%U https://proceedings.mlr.press/v130/turner21b.html
%V 130
%X Coresets have emerged as a powerful tool to summarize data by selecting a small subset of the original observations while retaining most of its information. This approach has led to significant computational speedups, but the performance of statistical procedures run on coresets is largely unexplored. In this work, we develop a statistical framework to study coresets and focus on the canonical task of nonparametric density estimation. Our contributions are twofold. First, we establish the minimax rate of estimation achievable by coreset-based estimators. Second, we show that the practical coreset kernel density estimators are near-minimax optimal over a large class of Hölder-smooth densities.
APA
Turner, P., Liu, J. & Rigollet, P. (2021). A Statistical Perspective on Coreset Density Estimation. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:2512-2520. Available from https://proceedings.mlr.press/v130/turner21b.html.
