Similarity-Based Link Prediction From Modular Compression of Network Flows

Christopher Blöcker, Jelena Smiljanić, Ingo Scholtes, Martin Rosvall
Proceedings of the First Learning on Graphs Conference, PMLR 198:52:1-52:18, 2022.

Abstract

Node similarity scores are a foundation for machine learning in graphs for clustering, node classification, anomaly detection, and link prediction with applications in biological systems, information networks, and recommender systems. Recent works on link prediction use vector space embeddings to calculate node similarities in undirected networks with good performance. Still, they have several disadvantages: limited interpretability, need for hyperparameter tuning, manual model fitting through dimensionality reduction, and poor performance from symmetric similarities in directed link prediction. We propose MapSim, an information-theoretic measure to assess node similarities based on modular compression of network flows. Unlike vector space embeddings, MapSim represents nodes in a discrete, non-metric space of communities and yields asymmetric similarities in an unsupervised fashion. We compare MapSim on a link prediction task to popular embedding-based algorithms across 47 networks and find that MapSim’s average performance across all networks is more than 7% higher than its closest competitor, outperforming all embedding methods in 11 of the 47 networks. Our method demonstrates the potential of compression-based approaches in graph representation learning, with promising applications in other graph learning tasks.

Cite this Paper


BibTeX
@InProceedings{pmlr-v198-blocker22a, title = {Similarity-Based Link Prediction From Modular Compression of Network Flows}, author = {Bl{\"o}cker, Christopher and Smiljani{\' c}, Jelena and Scholtes, Ingo and Rosvall, Martin}, booktitle = {Proceedings of the First Learning on Graphs Conference}, pages = {52:1--52:18}, year = {2022}, editor = {Rieck, Bastian and Pascanu, Razvan}, volume = {198}, series = {Proceedings of Machine Learning Research}, month = {09--12 Dec}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v198/blocker22a/blocker22a.pdf}, url = {https://proceedings.mlr.press/v198/blocker22a.html}, abstract = {Node similarity scores are a foundation for machine learning in graphs for clustering, node classification, anomaly detection, and link prediction with applications in biological systems, information networks, and recommender systems. Recent works on link prediction use vector space embeddings to calculate node similarities in undirected networks with good performance. Still, they have several disadvantages: limited interpretability, need for hyperparameter tuning, manual model fitting through dimensionality reduction, and poor performance from symmetric similarities in directed link prediction. We propose MapSim, an information-theoretic measure to assess node similarities based on modular compression of network flows. Unlike vector space embeddings, MapSim represents nodes in a discrete, non-metric space of communities and yields asymmetric similarities in an unsupervised fashion. We compare MapSim on a link prediction task to popular embedding-based algorithms across 47 networks and find that MapSim’s average performance across all networks is more than 7% higher than its closest competitor, outperforming all embedding methods in 11 of the 47 networks. Our method demonstrates the potential of compression-based approaches in graph representation learning, with promising applications in other graph learning tasks.} }
Endnote
%0 Conference Paper %T Similarity-Based Link Prediction From Modular Compression of Network Flows %A Christopher Blöcker %A Jelena Smiljanić %A Ingo Scholtes %A Martin Rosvall %B Proceedings of the First Learning on Graphs Conference %C Proceedings of Machine Learning Research %D 2022 %E Bastian Rieck %E Razvan Pascanu %F pmlr-v198-blocker22a %I PMLR %P 52:1--52:18 %U https://proceedings.mlr.press/v198/blocker22a.html %V 198 %X Node similarity scores are a foundation for machine learning in graphs for clustering, node classification, anomaly detection, and link prediction with applications in biological systems, information networks, and recommender systems. Recent works on link prediction use vector space embeddings to calculate node similarities in undirected networks with good performance. Still, they have several disadvantages: limited interpretability, need for hyperparameter tuning, manual model fitting through dimensionality reduction, and poor performance from symmetric similarities in directed link prediction. We propose MapSim, an information-theoretic measure to assess node similarities based on modular compression of network flows. Unlike vector space embeddings, MapSim represents nodes in a discrete, non-metric space of communities and yields asymmetric similarities in an unsupervised fashion. We compare MapSim on a link prediction task to popular embedding-based algorithms across 47 networks and find that MapSim’s average performance across all networks is more than 7% higher than its closest competitor, outperforming all embedding methods in 11 of the 47 networks. Our method demonstrates the potential of compression-based approaches in graph representation learning, with promising applications in other graph learning tasks.
APA
Blöcker, C., Smiljanić, J., Scholtes, I. & Rosvall, M.. (2022). Similarity-Based Link Prediction From Modular Compression of Network Flows. Proceedings of the First Learning on Graphs Conference, in Proceedings of Machine Learning Research 198:52:1-52:18 Available from https://proceedings.mlr.press/v198/blocker22a.html.

Related Material