DIGRAC: Digraph Clustering Based on Flow Imbalance

Yixuan He, Gesine Reinert, Mihai Cucuringu
Proceedings of the First Learning on Graphs Conference, PMLR 198:21:1-21:43, 2022.

Abstract

Node clustering is a powerful tool in the analysis of networks. We introduce a graph neural network framework, named DIGRAC, to obtain node embeddings for directed networks in a self-supervised manner, including a novel probabilistic imbalance loss, which can be used for network clustering. Here, we propose \textit{directed flow imbalance} measures, which are tightly related to directionality, to reveal clusters in the network even when there is no density difference between clusters. In contrast to standard approaches in the literature, in this paper, directionality is not treated as a nuisance, but rather contains the main signal. DIGRAC optimizes directed flow imbalance for clustering without requiring label supervision, unlike existing graph neural network methods, and can naturally incorporate node features, unlike existing spectral methods. Extensive experimental results on synthetic data, in the form of directed stochastic block models, and real-world data at different scales, demonstrate that our method, based on flow imbalance, attains state-of-the-art results on directed graph clustering when compared against 10 state-of-the-art methods from the literature, for a wide range of noise and sparsity levels, graph structures, and topologies, and even outperforms supervised methods.

Cite this Paper


BibTeX
@InProceedings{pmlr-v198-he22b, title = {DIGRAC: Digraph Clustering Based on Flow Imbalance}, author = {He, Yixuan and Reinert, Gesine and Cucuringu, Mihai}, booktitle = {Proceedings of the First Learning on Graphs Conference}, pages = {21:1--21:43}, year = {2022}, editor = {Rieck, Bastian and Pascanu, Razvan}, volume = {198}, series = {Proceedings of Machine Learning Research}, month = {09--12 Dec}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v198/he22b/he22b.pdf}, url = {https://proceedings.mlr.press/v198/he22b.html}, abstract = {Node clustering is a powerful tool in the analysis of networks. We introduce a graph neural network framework, named DIGRAC, to obtain node embeddings for directed networks in a self-supervised manner, including a novel probabilistic imbalance loss, which can be used for network clustering. Here, we propose \textit{directed flow imbalance} measures, which are tightly related to directionality, to reveal clusters in the network even when there is no density difference between clusters. In contrast to standard approaches in the literature, in this paper, directionality is not treated as a nuisance, but rather contains the main signal. DIGRAC optimizes directed flow imbalance for clustering without requiring label supervision, unlike existing graph neural network methods, and can naturally incorporate node features, unlike existing spectral methods. Extensive experimental results on synthetic data, in the form of directed stochastic block models, and real-world data at different scales, demonstrate that our method, based on flow imbalance, attains state-of-the-art results on directed graph clustering when compared against 10 state-of-the-art methods from the literature, for a wide range of noise and sparsity levels, graph structures, and topologies, and even outperforms supervised methods. } }
Endnote
%0 Conference Paper %T DIGRAC: Digraph Clustering Based on Flow Imbalance %A Yixuan He %A Gesine Reinert %A Mihai Cucuringu %B Proceedings of the First Learning on Graphs Conference %C Proceedings of Machine Learning Research %D 2022 %E Bastian Rieck %E Razvan Pascanu %F pmlr-v198-he22b %I PMLR %P 21:1--21:43 %U https://proceedings.mlr.press/v198/he22b.html %V 198 %X Node clustering is a powerful tool in the analysis of networks. We introduce a graph neural network framework, named DIGRAC, to obtain node embeddings for directed networks in a self-supervised manner, including a novel probabilistic imbalance loss, which can be used for network clustering. Here, we propose \textit{directed flow imbalance} measures, which are tightly related to directionality, to reveal clusters in the network even when there is no density difference between clusters. In contrast to standard approaches in the literature, in this paper, directionality is not treated as a nuisance, but rather contains the main signal. DIGRAC optimizes directed flow imbalance for clustering without requiring label supervision, unlike existing graph neural network methods, and can naturally incorporate node features, unlike existing spectral methods. Extensive experimental results on synthetic data, in the form of directed stochastic block models, and real-world data at different scales, demonstrate that our method, based on flow imbalance, attains state-of-the-art results on directed graph clustering when compared against 10 state-of-the-art methods from the literature, for a wide range of noise and sparsity levels, graph structures, and topologies, and even outperforms supervised methods.
APA
He, Y., Reinert, G. & Cucuringu, M.. (2022). DIGRAC: Digraph Clustering Based on Flow Imbalance. Proceedings of the First Learning on Graphs Conference, in Proceedings of Machine Learning Research 198:21:1-21:43 Available from https://proceedings.mlr.press/v198/he22b.html.

Related Material