Cover learning for large-scale topology representation

Luis Scoccola, Uzu Lim, Heather A. Harrington
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:53728-53756, 2025.

Abstract

Classical unsupervised learning methods like clustering and linear dimensionality reduction parametrize large-scale geometry when it is discrete or linear, while more modern methods from manifold learning find low dimensional representation or infer local geometry by constructing a graph on the input data. More recently, topological data analysis popularized the use of simplicial complexes to represent data topology with two main methodologies: topological inference with geometric complexes and large-scale topology representation with Mapper graphs – central to these is the nerve construction from topology, which builds a simplicial complex given any cover of a space by subsets. While successful, these have limitations: geometric complexes scale poorly with data size, and Mapper graphs can be hard to tune and only contain low dimensional information. In this paper, we propose to study the problem of learning covers in its own right, and from the perspective of optimization. We describe a method to learn topologically-faithful covers of geometric datasets, and show that the simplicial complexes thus obtained can outperform standard topological inference approaches in terms of size, and Mapper-type algorithms in terms of representation of large-scale topology.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-scoccola25a,
  title     = {Cover learning for large-scale topology representation},
  author    = {Scoccola, Luis and Lim, Uzu and Harrington, Heather A.},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {53728--53756},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/scoccola25a/scoccola25a.pdf},
  url       = {https://proceedings.mlr.press/v267/scoccola25a.html},
  abstract  = {Classical unsupervised learning methods like clustering and linear dimensionality reduction parametrize large-scale geometry when it is discrete or linear, while more modern methods from manifold learning find low dimensional representation or infer local geometry by constructing a graph on the input data. More recently, topological data analysis popularized the use of simplicial complexes to represent data topology with two main methodologies: topological inference with geometric complexes and large-scale topology representation with Mapper graphs – central to these is the nerve construction from topology, which builds a simplicial complex given any cover of a space by subsets. While successful, these have limitations: geometric complexes scale poorly with data size, and Mapper graphs can be hard to tune and only contain low dimensional information. In this paper, we propose to study the problem of learning covers in its own right, and from the perspective of optimization. We describe a method to learn topologically-faithful covers of geometric datasets, and show that the simplicial complexes thus obtained can outperform standard topological inference approaches in terms of size, and Mapper-type algorithms in terms of representation of large-scale topology.}
}
Endnote
%0 Conference Paper
%T Cover learning for large-scale topology representation
%A Luis Scoccola
%A Uzu Lim
%A Heather A. Harrington
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-scoccola25a
%I PMLR
%P 53728--53756
%U https://proceedings.mlr.press/v267/scoccola25a.html
%V 267
%X Classical unsupervised learning methods like clustering and linear dimensionality reduction parametrize large-scale geometry when it is discrete or linear, while more modern methods from manifold learning find low dimensional representation or infer local geometry by constructing a graph on the input data. More recently, topological data analysis popularized the use of simplicial complexes to represent data topology with two main methodologies: topological inference with geometric complexes and large-scale topology representation with Mapper graphs – central to these is the nerve construction from topology, which builds a simplicial complex given any cover of a space by subsets. While successful, these have limitations: geometric complexes scale poorly with data size, and Mapper graphs can be hard to tune and only contain low dimensional information. In this paper, we propose to study the problem of learning covers in its own right, and from the perspective of optimization. We describe a method to learn topologically-faithful covers of geometric datasets, and show that the simplicial complexes thus obtained can outperform standard topological inference approaches in terms of size, and Mapper-type algorithms in terms of representation of large-scale topology.
APA
Scoccola, L., Lim, U. & Harrington, H. A. (2025). Cover learning for large-scale topology representation. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:53728-53756. Available from https://proceedings.mlr.press/v267/scoccola25a.html.
