Universal Representation of Permutation-Invariant Functions on Vectors and Tensors

Puoya Tabaghi, Yusu Wang
Proceedings of The 35th International Conference on Algorithmic Learning Theory, PMLR 237:1134-1187, 2024.

Abstract

The main objects of our study are multiset functions, that is, permutation-invariant functions over inputs of varying sizes. Deep Sets, proposed by Zaheer et al. (2017), provides a universal representation for continuous multiset functions on scalars via a sum-decomposable model. When the domain is restricted to finite multisets of $D$-dimensional vectors, Deep Sets also provides a universal approximation, but one that requires a latent space of dimension $O(N^D)$, where $N$ is an upper bound on the size of the input multisets. In this paper, we first strengthen this result by proving that universal representation (not merely approximation) is guaranteed for both continuous and discontinuous multiset functions with a latent space of dimension $O(N^D)$, a bound we then improve upon. We next introduce identifiable multisets, whose elements can be uniquely labeled by an identifier function; for example, multisets of finite-precision vectors are identifiable. Based on our analysis of identifiable multisets, we prove that a sum-decomposable model for general continuous multiset functions requires a latent dimension of only $2DN$, as opposed to $O(N^D)$. We further show that both the encoder and decoder functions of this model are continuous, our main contribution over existing work, which lacks such a guarantee. The $2DN$ bound also significantly improves on the aforementioned $O(N^D)$ bound, which was derived for the universal representation of both continuous and discontinuous multiset functions. Finally, we extend our results and provide special sum-decomposition structures that universally represent permutation-invariant tensor functions on identifiable tensors. These families of sum-decomposition models enable us to design deep network architectures and deploy them on a variety of learning tasks on sequences, images, and graphs.
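To make the sum-decomposable model concrete: it represents a multiset function $f$ as $f(X) = \rho\left(\sum_{x \in X} \phi(x)\right)$, where the encoder $\phi$ maps each element into a latent space and the decoder $\rho$ acts on the summed latents; the "latent dimension" in the abstract is the width of this summed bottleneck. Below is a minimal PyTorch sketch of such a model using the paper's $2DN$ latent dimension. It is illustrative only: the layer sizes, depths, and names are our own choices, not the paper's construction.

import torch
import torch.nn as nn

class SumDecomposable(nn.Module):
    """Deep Sets-style model: f(X) = rho(sum over x in X of phi(x))."""
    def __init__(self, D: int, N: int, out_dim: int = 1):
        super().__init__()
        latent = 2 * D * N  # latent dimension 2DN, per the paper's bound
        self.phi = nn.Sequential(  # per-element encoder phi
            nn.Linear(D, latent), nn.ReLU(), nn.Linear(latent, latent))
        self.rho = nn.Sequential(  # decoder rho on the pooled latent
            nn.Linear(latent, latent), nn.ReLU(), nn.Linear(latent, out_dim))

    def forward(self, X: torch.Tensor) -> torch.Tensor:
        # X has shape (batch, n, D) with n <= N; summing over the element
        # axis makes the output invariant to permutations of the elements.
        return self.rho(self.phi(X).sum(dim=1))

# Permutation-invariance check on random multisets (up to float error).
model = SumDecomposable(D=3, N=10)
X = torch.randn(4, 10, 3)
assert torch.allclose(model(X), model(X[:, torch.randperm(10)]), atol=1e-5)

Because addition is commutative, the pooled latent, and hence the output, is unchanged under any reordering of the input elements; the paper's contribution concerns how small this latent space can be while still representing every continuous multiset function exactly.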

Cite this Paper

BibTeX
@InProceedings{pmlr-v237-tabaghi24a,
  title     = {Universal Representation of Permutation-Invariant Functions on Vectors and Tensors},
  author    = {Tabaghi, Puoya and Wang, Yusu},
  booktitle = {Proceedings of The 35th International Conference on Algorithmic Learning Theory},
  pages     = {1134--1187},
  year      = {2024},
  editor    = {Vernade, Claire and Hsu, Daniel},
  volume    = {237},
  series    = {Proceedings of Machine Learning Research},
  month     = {25--28 Feb},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v237/tabaghi24a/tabaghi24a.pdf},
  url       = {https://proceedings.mlr.press/v237/tabaghi24a.html}
}
Endnote
%0 Conference Paper
%T Universal Representation of Permutation-Invariant Functions on Vectors and Tensors
%A Puoya Tabaghi
%A Yusu Wang
%B Proceedings of The 35th International Conference on Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2024
%E Claire Vernade
%E Daniel Hsu
%F pmlr-v237-tabaghi24a
%I PMLR
%P 1134--1187
%U https://proceedings.mlr.press/v237/tabaghi24a.html
%V 237
APA
Tabaghi, P. & Wang, Y. (2024). Universal Representation of Permutation-Invariant Functions on Vectors and Tensors. Proceedings of The 35th International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 237:1134-1187. Available from https://proceedings.mlr.press/v237/tabaghi24a.html.