The Effective Number of Shared Dimensions Between Paired Datasets

Hamza Giaffar, Camille Rullán Buxó, Mikio Aoi
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:4249-4257, 2024.

Abstract

A number of recent studies have sought to understand the behavior of both artificial and biological neural networks by comparing representations across layers, networks and brain areas. Increasingly prevalent, too, are comparisons across modalities of data, such as neural network activations and training data or behavioral data and neurophysiological recordings. One approach to such comparisons involves measuring the dimensionality of the space shared between the paired data matrices, where dimensionality serves as a proxy for computational or representational complexity. Established approaches, including canonical correlation analysis (CCA), can be used to measure the number of shared embedding dimensions; however, they do not account for potentially unequal variance along shared dimensions and so cannot measure effective shared dimensionality. We present a candidate measure for shared dimensionality that we call the effective number of shared dimensions (ENSD). The ENSD is an interpretable and computationally efficient model-free measure of shared dimensionality that can be used to probe shared structure in a wide variety of data types. We demonstrate the relative robustness of the ENSD in cases where data is sparse or low rank and illustrate how the ENSD can be applied in a variety of analyses of representational similarities across layers in convolutional neural networks and between brain regions.
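The abstract does not state the ENSD formula, so the sketch below is a rough illustration only: it computes a participation-ratio-style estimate of shared dimensionality from two paired data matrices. The specific generalization used, tr(C_X) tr(C_Y) / tr(C_X C_Y) on the sample-by-sample Gram matrices (which reduces to the participation ratio tr(C)^2 / tr(C^2) when X = Y), is an assumption for illustration and should not be read as the paper's exact definition of the ENSD.

import numpy as np

def participation_ratio(C):
    """Effective number of dimensions of one dataset from its Gram/covariance
    matrix C: PR = tr(C)^2 / tr(C @ C)."""
    return np.trace(C) ** 2 / np.trace(C @ C)

def shared_dimensionality(X, Y):
    """Illustrative ENSD-style estimate for paired data matrices X, Y
    (rows = matched samples, columns = features/neurons/units).

    NOTE: tr(Cx) * tr(Cy) / tr(Cx @ Cy) is assumed here for illustration;
    it reduces to the participation ratio when X == Y but may differ from
    the paper's exact definition of the ENSD.
    """
    X = X - X.mean(axis=0)   # center each feature
    Y = Y - Y.mean(axis=0)
    Cx = X @ X.T             # sample-by-sample Gram matrices, comparable
    Cy = Y @ Y.T             # even when the two feature counts differ
    return np.trace(Cx) * np.trace(Cy) / np.trace(Cx @ Cy)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    latent = rng.normal(size=(200, 3))   # 3 shared latent dimensions
    X = latent @ rng.normal(size=(3, 50)) + 0.1 * rng.normal(size=(200, 50))
    Y = latent @ rng.normal(size=(3, 80)) + 0.1 * rng.normal(size=(200, 80))
    print(shared_dimensionality(X, Y))   # expected to come out near 3

In this toy example the two matrices share three latent dimensions of roughly equal variance, so the estimate should land near 3; unequal variance along the shared dimensions would pull the effective count below the raw rank of the shared subspace, which is the distinction the abstract draws between counting embedding dimensions and measuring effective shared dimensionality.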

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-giaffar24a,
  title     = {The Effective Number of Shared Dimensions Between Paired Datasets},
  author    = {Giaffar, Hamza and Rull\'{a}n Bux\'{o}, Camille and Aoi, Mikio},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages     = {4249--4257},
  year      = {2024},
  editor    = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume    = {238},
  series    = {Proceedings of Machine Learning Research},
  month     = {02--04 May},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v238/giaffar24a/giaffar24a.pdf},
  url       = {https://proceedings.mlr.press/v238/giaffar24a.html}
}
Endnote
%0 Conference Paper
%T The Effective Number of Shared Dimensions Between Paired Datasets
%A Hamza Giaffar
%A Camille Rullán Buxó
%A Mikio Aoi
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-giaffar24a
%I PMLR
%P 4249--4257
%U https://proceedings.mlr.press/v238/giaffar24a.html
%V 238
APA
Giaffar, H., Rullán Buxó, C. & Aoi, M. (2024). The Effective Number of Shared Dimensions Between Paired Datasets. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:4249-4257. Available from https://proceedings.mlr.press/v238/giaffar24a.html.
