Uniform Convergence Rate of the Kernel Density Estimator Adaptive to Intrinsic Volume Dimension

Jisu Kim, Jaehyeok Shin, Alessandro Rinaldo, Larry Wasserman
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:3398-3407, 2019.

Abstract

We derive concentration inequalities for the supremum norm of the difference between a kernel density estimator (KDE) and its pointwise expectation that hold uniformly over the choice of bandwidth and under weaker conditions on the kernel and the data-generating distribution than previously used in the literature. We first propose a novel concept, the volume dimension, which measures the intrinsic dimension of the support of a probability distribution through the rate of decay of the probability of vanishing Euclidean balls. Our bounds depend on the volume dimension and generalize the existing bounds in the literature. In particular, when the data-generating distribution has a bounded Lebesgue density or is supported on a sufficiently well-behaved lower-dimensional manifold, our bound recovers the known convergence rates, which depend on the intrinsic dimension of the support. At the same time, our results apply to more general settings, such as distributions with unbounded densities or distributions supported on a mixture of manifolds of different dimensions. Analogous bounds are derived for derivatives of the KDE of any order. Our results are broadly applicable but are especially useful for problems in geometric inference and topological data analysis, including level set estimation, density-based clustering, modal clustering and mode hunting, ridge estimation, and persistent homology.
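
For intuition, the volume dimension described above can be formalized roughly as follows. This is only a sketch based on the abstract's description; the notation P for the distribution, B(x, r) for the Euclidean ball of radius r centered at x, and the exponent ν are ours, and the paper's precise definition may differ in its details:

\[
d_{\mathrm{vol}} \;=\; \sup\Big\{\, \nu \ge 0 \;:\; \limsup_{r \downarrow 0}\; \sup_{x} \frac{P\big(B(x, r)\big)}{r^{\nu}} \,<\, \infty \,\Big\}.
\]

Under this reading, a distribution with a bounded Lebesgue density on R^D would have volume dimension D, while one supported on a sufficiently regular m-dimensional manifold would have volume dimension m, matching the special cases mentioned in the abstract; distributions with unbounded densities or supported on mixtures of manifolds of different dimensions are still assigned a well-defined value, which is what drives the more general rates.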

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-kim19e,
  title     = {Uniform Convergence Rate of the Kernel Density Estimator Adaptive to Intrinsic Volume Dimension},
  author    = {Kim, Jisu and Shin, Jaehyeok and Rinaldo, Alessandro and Wasserman, Larry},
  booktitle = {Proceedings of the 36th International Conference on Machine Learning},
  pages     = {3398--3407},
  year      = {2019},
  editor    = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume    = {97},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--15 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v97/kim19e/kim19e.pdf},
  url       = {https://proceedings.mlr.press/v97/kim19e.html},
  abstract  = {We derive concentration inequalities for the supremum norm of the difference between a kernel density estimator (KDE) and its point-wise expectation that hold uniformly over the selection of the bandwidth and under weaker conditions on the kernel and the data generating distribution than previously used in the literature. We first propose a novel concept, called the volume dimension, to measure the intrinsic dimension of the support of a probability distribution based on the rates of decay of the probability of vanishing Euclidean balls. Our bounds depend on the volume dimension and generalize the existing bounds derived in the literature. In particular, when the data-generating distribution has a bounded Lebesgue density or is supported on a sufficiently well-behaved lower-dimensional manifold, our bound recovers the same convergence rate depending on the intrinsic dimension of the support as ones known in the literature. At the same time, our results apply to more general cases, such as the ones of distribution with unbounded densities or supported on a mixture of manifolds with different dimensions. Analogous bounds are derived for the derivative of the KDE, of any order. Our results are generally applicable but are especially useful for problems in geometric inference and topological data analysis, including level set estimation, density-based clustering, modal clustering and mode hunting, ridge estimation and persistent homology.}
}
Endnote
%0 Conference Paper
%T Uniform Convergence Rate of the Kernel Density Estimator Adaptive to Intrinsic Volume Dimension
%A Jisu Kim
%A Jaehyeok Shin
%A Alessandro Rinaldo
%A Larry Wasserman
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-kim19e
%I PMLR
%P 3398--3407
%U https://proceedings.mlr.press/v97/kim19e.html
%V 97
%X We derive concentration inequalities for the supremum norm of the difference between a kernel density estimator (KDE) and its point-wise expectation that hold uniformly over the selection of the bandwidth and under weaker conditions on the kernel and the data generating distribution than previously used in the literature. We first propose a novel concept, called the volume dimension, to measure the intrinsic dimension of the support of a probability distribution based on the rates of decay of the probability of vanishing Euclidean balls. Our bounds depend on the volume dimension and generalize the existing bounds derived in the literature. In particular, when the data-generating distribution has a bounded Lebesgue density or is supported on a sufficiently well-behaved lower-dimensional manifold, our bound recovers the same convergence rate depending on the intrinsic dimension of the support as ones known in the literature. At the same time, our results apply to more general cases, such as the ones of distribution with unbounded densities or supported on a mixture of manifolds with different dimensions. Analogous bounds are derived for the derivative of the KDE, of any order. Our results are generally applicable but are especially useful for problems in geometric inference and topological data analysis, including level set estimation, density-based clustering, modal clustering and mode hunting, ridge estimation and persistent homology.
APA
Kim, J., Shin, J., Rinaldo, A. & Wasserman, L. (2019). Uniform Convergence Rate of the Kernel Density Estimator Adaptive to Intrinsic Volume Dimension. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:3398-3407. Available from https://proceedings.mlr.press/v97/kim19e.html.
