On the estimation of persistence intensity functions and linear representations of persistence diagrams

Weichen Wu, Jisu Kim, Alessandro Rinaldo
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:3610-3618, 2024.

Abstract

Persistence diagrams are one of the most popular types of data summaries used in Topological Data Analysis. The prevailing statistical approach to analyzing persistence diagrams is concerned with filtering out topological noise. In this paper, we adopt a different viewpoint and aim at estimating the actual distribution of a random persistence diagram, which captures both topological signal and noise. To that effect, Chazel et al., (2018) proved that, under general conditions, the expected value of a random persistence diagram is a measure admitting a Lebesgue density, called the persistence intensity function. In this paper, we are concerned with estimating the persistence intensity function and a novel, normalized version of it – called the persistence density function. We present a class of kernel-based estimators based on an i.i.d. sample of persistence diagrams and derive estimation rates in the supremum norm. As a direct corollary, we obtain uniform consistency rates for estimating linear representations of persistence diagrams, including Betti numbers and persistence surfaces. Interestingly, the persistence density function delivers stronger statistical guarantees.

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-wu24f, title = {On the estimation of persistence intensity functions and linear representations of persistence diagrams}, author = {Wu, Weichen and Kim, Jisu and Rinaldo, Alessandro}, booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics}, pages = {3610--3618}, year = {2024}, editor = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen}, volume = {238}, series = {Proceedings of Machine Learning Research}, month = {02--04 May}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v238/wu24f/wu24f.pdf}, url = {https://proceedings.mlr.press/v238/wu24f.html}, abstract = {Persistence diagrams are one of the most popular types of data summaries used in Topological Data Analysis. The prevailing statistical approach to analyzing persistence diagrams is concerned with filtering out topological noise. In this paper, we adopt a different viewpoint and aim at estimating the actual distribution of a random persistence diagram, which captures both topological signal and noise. To that effect, Chazel et al., (2018) proved that, under general conditions, the expected value of a random persistence diagram is a measure admitting a Lebesgue density, called the persistence intensity function. In this paper, we are concerned with estimating the persistence intensity function and a novel, normalized version of it – called the persistence density function. We present a class of kernel-based estimators based on an i.i.d. sample of persistence diagrams and derive estimation rates in the supremum norm. As a direct corollary, we obtain uniform consistency rates for estimating linear representations of persistence diagrams, including Betti numbers and persistence surfaces. Interestingly, the persistence density function delivers stronger statistical guarantees.} }
Endnote
%0 Conference Paper %T On the estimation of persistence intensity functions and linear representations of persistence diagrams %A Weichen Wu %A Jisu Kim %A Alessandro Rinaldo %B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2024 %E Sanjoy Dasgupta %E Stephan Mandt %E Yingzhen Li %F pmlr-v238-wu24f %I PMLR %P 3610--3618 %U https://proceedings.mlr.press/v238/wu24f.html %V 238 %X Persistence diagrams are one of the most popular types of data summaries used in Topological Data Analysis. The prevailing statistical approach to analyzing persistence diagrams is concerned with filtering out topological noise. In this paper, we adopt a different viewpoint and aim at estimating the actual distribution of a random persistence diagram, which captures both topological signal and noise. To that effect, Chazel et al., (2018) proved that, under general conditions, the expected value of a random persistence diagram is a measure admitting a Lebesgue density, called the persistence intensity function. In this paper, we are concerned with estimating the persistence intensity function and a novel, normalized version of it – called the persistence density function. We present a class of kernel-based estimators based on an i.i.d. sample of persistence diagrams and derive estimation rates in the supremum norm. As a direct corollary, we obtain uniform consistency rates for estimating linear representations of persistence diagrams, including Betti numbers and persistence surfaces. Interestingly, the persistence density function delivers stronger statistical guarantees.
APA
Wu, W., Kim, J. & Rinaldo, A.. (2024). On the estimation of persistence intensity functions and linear representations of persistence diagrams. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:3610-3618 Available from https://proceedings.mlr.press/v238/wu24f.html.

Related Material