Representation Topology Divergence: A Method for Comparing Neural Network Representations.

Serguei Barannikov, Ilya Trofimov, Nikita Balabin, Evgeny Burnaev
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:1607-1626, 2022.

Abstract

Comparison of data representations is a complex multi-aspect problem. We propose a method for comparing two data representations. We introduce the Representation Topology Divergence (RTD) score measuring the dissimilarity in multi-scale topology between two point clouds of equal size with a one-to-one correspondence between points. The two data point clouds can lie in different ambient spaces. The RTD score is one of the few topological data analysis based practical methods applicable to real machine learning datasets. Experiments show the agreement of RTD with the intuitive assessment of data representation similarity. The proposed RTD score is sensitive to the data representation’s fine topological structure. We use the RTD score to gain insights on neural networks representations in computer vision and NLP domains for various problems: training dynamics analysis, data distribution shift, transfer learning, ensemble learning, disentanglement assessment.

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-barannikov22a,
  title = {Representation Topology Divergence: A Method for Comparing Neural Network Representations.},
  author = {Barannikov, Serguei and Trofimov, Ilya and Balabin, Nikita and Burnaev, Evgeny},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages = {1607--1626},
  year = {2022},
  editor = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume = {162},
  series = {Proceedings of Machine Learning Research},
  month = {17--23 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v162/barannikov22a/barannikov22a.pdf},
  url = {https://proceedings.mlr.press/v162/barannikov22a.html},
  abstract = {Comparison of data representations is a complex multi-aspect problem. We propose a method for comparing two data representations. We introduce the Representation Topology Divergence (RTD) score measuring the dissimilarity in multi-scale topology between two point clouds of equal size with a one-to-one correspondence between points. The two data point clouds can lie in different ambient spaces. The RTD score is one of the few topological data analysis based practical methods applicable to real machine learning datasets. Experiments show the agreement of RTD with the intuitive assessment of data representation similarity. The proposed RTD score is sensitive to the data representation’s fine topological structure. We use the RTD score to gain insights on neural networks representations in computer vision and NLP domains for various problems: training dynamics analysis, data distribution shift, transfer learning, ensemble learning, disentanglement assessment.}
}
Endnote
%0 Conference Paper
%T Representation Topology Divergence: A Method for Comparing Neural Network Representations.
%A Serguei Barannikov
%A Ilya Trofimov
%A Nikita Balabin
%A Evgeny Burnaev
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-barannikov22a
%I PMLR
%P 1607--1626
%U https://proceedings.mlr.press/v162/barannikov22a.html
%V 162
%X Comparison of data representations is a complex multi-aspect problem. We propose a method for comparing two data representations. We introduce the Representation Topology Divergence (RTD) score measuring the dissimilarity in multi-scale topology between two point clouds of equal size with a one-to-one correspondence between points. The two data point clouds can lie in different ambient spaces. The RTD score is one of the few topological data analysis based practical methods applicable to real machine learning datasets. Experiments show the agreement of RTD with the intuitive assessment of data representation similarity. The proposed RTD score is sensitive to the data representation’s fine topological structure. We use the RTD score to gain insights on neural networks representations in computer vision and NLP domains for various problems: training dynamics analysis, data distribution shift, transfer learning, ensemble learning, disentanglement assessment.
APA
Barannikov, S., Trofimov, I., Balabin, N. & Burnaev, E. (2022). Representation Topology Divergence: A Method for Comparing Neural Network Representations. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:1607-1626. Available from https://proceedings.mlr.press/v162/barannikov22a.html.
