Neighborhood Growth Determines Geometric Priors for Relational Representation Learning

Melanie Weber
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:266-276, 2020.

Abstract

The problem of identifying geometric structure in heterogeneous, high-dimensional data is a cornerstone of representation learning. While there exists a large body of literature on the embeddability of canonical graphs, such as lattices or trees, the heterogeneity of the relational data typically encountered in practice limits the applicability of these classical methods. In this paper, we propose a combinatorial approach to evaluating embeddability, i.e., to decide whether a data set is best represented in Euclidean, Hyperbolic or Spherical space. Our method analyzes nearest-neighbor structures and local neighborhood growth rates to identify the geometric priors of suitable embedding spaces. For canonical graphs, the algorithm’s prediction provably matches classical results. As for large, heterogeneous graphs, we introduce an efficiently computable statistic that approximates the algorithm’s decision rule. We validate our method over a range of benchmark data sets and compare with recently published optimization-based embeddability methods.

Cite this Paper


BibTeX
@InProceedings{pmlr-v108-weber20a, title = {Neighborhood Growth Determines Geometric Priors for Relational Representation Learning}, author = {Weber, Melanie}, booktitle = {Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics}, pages = {266--276}, year = {2020}, editor = {Chiappa, Silvia and Calandra, Roberto}, volume = {108}, series = {Proceedings of Machine Learning Research}, month = {26--28 Aug}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v108/weber20a/weber20a.pdf}, url = {https://proceedings.mlr.press/v108/weber20a.html}, abstract = {The problem of identifying geometric structure in heterogeneous, high-dimensional data is a cornerstone of representation learning. While there exists a large body of literature on the embeddability of canonical graphs, such as lattices or trees, the heterogeneity of the relational data typically encountered in practice limits the applicability of these classical methods. In this paper, we propose a combinatorial approach to evaluating embeddability, i.e., to decide whether a data set is best represented in Euclidean, Hyperbolic or Spherical space. Our method analyzes nearest-neighbor structures and local neighborhood growth rates to identify the geometric priors of suitable embedding spaces. For canonical graphs, the algorithm’s prediction provably matches classical results. As for large, heterogeneous graphs, we introduce an efficiently computable statistic that approximates the algorithm’s decision rule. We validate our method over a range of benchmark data sets and compare with recently published optimization-based embeddability methods.} }
Endnote
%0 Conference Paper %T Neighborhood Growth Determines Geometric Priors for Relational Representation Learning %A Melanie Weber %B Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2020 %E Silvia Chiappa %E Roberto Calandra %F pmlr-v108-weber20a %I PMLR %P 266--276 %U https://proceedings.mlr.press/v108/weber20a.html %V 108 %X The problem of identifying geometric structure in heterogeneous, high-dimensional data is a cornerstone of representation learning. While there exists a large body of literature on the embeddability of canonical graphs, such as lattices or trees, the heterogeneity of the relational data typically encountered in practice limits the applicability of these classical methods. In this paper, we propose a combinatorial approach to evaluating embeddability, i.e., to decide whether a data set is best represented in Euclidean, Hyperbolic or Spherical space. Our method analyzes nearest-neighbor structures and local neighborhood growth rates to identify the geometric priors of suitable embedding spaces. For canonical graphs, the algorithm’s prediction provably matches classical results. As for large, heterogeneous graphs, we introduce an efficiently computable statistic that approximates the algorithm’s decision rule. We validate our method over a range of benchmark data sets and compare with recently published optimization-based embeddability methods.
APA
Weber, M.. (2020). Neighborhood Growth Determines Geometric Priors for Relational Representation Learning. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 108:266-276 Available from https://proceedings.mlr.press/v108/weber20a.html.

Related Material