Fuzzy c-means clustering in persistence diagram space for deep learning model selection

Thomas Davies, Jack Aspinall, Bryan Wilder, Tran-Thanh Long
Proceedings of the 1st NeurIPS Workshop on Symmetry and Geometry in Neural Representations, PMLR 197:137-157, 2023.

Abstract

Persistence diagrams concisely capture the structure of data, an ability that is increasingly being used in the nascent field of topological machine learning. We extend the ubiquitous Fuzzy c-Means (FCM) clustering algorithm to the space of persistence diagrams, enabling unsupervised learning in a topological setting. We give theoretical convergence guarantees that correspond to the Euclidean case and empirically demonstrate the capability of the clustering to capture topological information via the fuzzy RAND index. We present an application of our algorithm to a scenario that utilises both the topological and fuzzy nature of our algorithm: pre-trained model selection in deep learning. As pre-trained models can perform well on multiple tasks, selecting the best model is a naturally fuzzy problem; we show that fuzzy clustering persistence diagrams allows for unsupervised model selection using just the topology of their decision boundaries.
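For readers unfamiliar with the algorithm being extended, the following is a minimal sketch of classical Euclidean Fuzzy c-Means, not the paper's method. The function name `fcm`, its parameters, and the toy data are assumptions made for illustration. The extension described in the abstract would replace the Euclidean distance and the weighted mean with counterparts defined on persistence diagrams, which are not shown here.

```python
import math


def fcm(points, centers, m=2.0, iters=100, tol=1e-9):
    """Classical Euclidean fuzzy c-means (illustrative sketch only).

    points  : list of equal-length tuples of floats
    centers : initial cluster centres (hypothetical starting guess)
    m       : fuzzifier, must be > 1
    Returns (centers, memberships); memberships[j][k] is the degree to
    which point j belongs to cluster k, taken from the last membership update.
    """
    centers = [tuple(c) for c in centers]
    dim = len(points[0])
    memberships = []
    for _ in range(iters):
        # Membership update: inverse-distance weights, normalised per point.
        memberships = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            if any(d == 0.0 for d in dists):
                # Point coincides with a centre: share all weight among those.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(hits)
                memberships.append([h / total for h in hits])
                continue
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            memberships.append([w / total for w in inv])

        # Centre update: mean of the points weighted by membership ** m.
        new_centers = []
        for k in range(len(centers)):
            weights = [row[k] ** m for row in memberships]
            wsum = sum(weights)
            new_centers.append(tuple(
                sum(w * p[i] for w, p in zip(weights, points)) / wsum
                for i in range(dim)
            ))

        # Stop once no centre moves by more than tol.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships


# Toy data (made up): two well-separated groups of 2-D points.
toy_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
              (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
toy_centers, toy_memberships = fcm(toy_points, [(0.0, 0.0), (10.0, 10.0)])
```

Each row of `toy_memberships` sums to 1, which is what makes the assignment fuzzy rather than hard: a point can belong partly to several clusters, the property the abstract relies on for model selection.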

Cite this Paper


BibTeX
@InProceedings{pmlr-v197-davies23a,
  title = {Fuzzy c-means clustering in persistence diagram space for deep learning model selection},
  author = {Davies, Thomas and Aspinall, Jack and Wilder, Bryan and Long, Tran-Thanh},
  booktitle = {Proceedings of the 1st NeurIPS Workshop on Symmetry and Geometry in Neural Representations},
  pages = {137--157},
  year = {2023},
  editor = {Sanborn, Sophia and Shewmake, Christian and Azeglio, Simone and Di Bernardo, Arianna and Miolane, Nina},
  volume = {197},
  series = {Proceedings of Machine Learning Research},
  month = {03 Dec},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v197/davies23a/davies23a.pdf},
  url = {https://proceedings.mlr.press/v197/davies23a.html},
  abstract = {Persistence diagrams concisely capture the structure of data, an ability that is increasingly being used in the nascent field of topological machine learning. We extend the ubiquitous Fuzzy c-Means (FCM) clustering algorithm to the space of persistence diagrams, enabling unsupervised learning in a topological setting. We give theoretical convergence guarantees that correspond to the Euclidean case and empirically demonstrate the capability of the clustering to capture topological information via the fuzzy RAND index. We present an application of our algorithm to a scenario that utilises both the topological and fuzzy nature of our algorithm: pre-trained model selection in deep learning. As pre-trained models can perform well on multiple tasks, selecting the best model is a naturally fuzzy problem; we show that fuzzy clustering persistence diagrams allows for unsupervised model selection using just the topology of their decision boundaries.}
}
Endnote
%0 Conference Paper
%T Fuzzy c-means clustering in persistence diagram space for deep learning model selection
%A Thomas Davies
%A Jack Aspinall
%A Bryan Wilder
%A Tran-Thanh Long
%B Proceedings of the 1st NeurIPS Workshop on Symmetry and Geometry in Neural Representations
%C Proceedings of Machine Learning Research
%D 2023
%E Sophia Sanborn
%E Christian Shewmake
%E Simone Azeglio
%E Arianna Di Bernardo
%E Nina Miolane
%F pmlr-v197-davies23a
%I PMLR
%P 137--157
%U https://proceedings.mlr.press/v197/davies23a.html
%V 197
%X Persistence diagrams concisely capture the structure of data, an ability that is increasingly being used in the nascent field of topological machine learning. We extend the ubiquitous Fuzzy c-Means (FCM) clustering algorithm to the space of persistence diagrams, enabling unsupervised learning in a topological setting. We give theoretical convergence guarantees that correspond to the Euclidean case and empirically demonstrate the capability of the clustering to capture topological information via the fuzzy RAND index. We present an application of our algorithm to a scenario that utilises both the topological and fuzzy nature of our algorithm: pre-trained model selection in deep learning. As pre-trained models can perform well on multiple tasks, selecting the best model is a naturally fuzzy problem; we show that fuzzy clustering persistence diagrams allows for unsupervised model selection using just the topology of their decision boundaries.
APA
Davies, T., Aspinall, J., Wilder, B., & Long, T. (2023). Fuzzy c-means clustering in persistence diagram space for deep learning model selection. Proceedings of the 1st NeurIPS Workshop on Symmetry and Geometry in Neural Representations, in Proceedings of Machine Learning Research 197:137-157. Available from https://proceedings.mlr.press/v197/davies23a.html.
