GC-Flow: A Graph-Based Flow Network for Effective Clustering

Tianchun Wang, Farzaneh Mirzazadeh, Xiang Zhang, Jie Chen
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:36157-36173, 2023.

Abstract

Graph convolutional networks (GCNs) are discriminative models that directly model the class posterior $p(y|\mathbf{x})$ for semi-supervised classification of graph data. While being effective, as a representation learning approach, the node representations extracted from a GCN often miss useful information for effective clustering, because the objectives are different. In this work, we design normalizing flows that replace GCN layers, leading to a generative model that models both the class conditional likelihood $p(\mathbf{x}|y)$ and the class prior $p(y)$. The resulting neural network, GC-Flow, retains the graph convolution operations while being equipped with a Gaussian mixture representation space. It enjoys two benefits: it not only maintains the predictive power of GCN, but also produces well-separated clusters, due to the structuring of the representation space. We demonstrate these benefits on a variety of benchmark data sets. Moreover, we show that additional parameterization, such as that on the adjacency matrix used for graph convolutions, yields additional improvement in clustering.
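The relation between the generative quantities $p(\mathbf{x}|y)$ and $p(y)$ and the class posterior $p(y|\mathbf{x})$ can be sketched with Bayes' rule. The notation below is an assumption for illustration and is not taken from the abstract: $f$ denotes an invertible flow with $\mathbf{z} = f(\mathbf{x})$, and the Gaussian mixture in the representation space has weights $\pi_k$, means $\boldsymbol{\mu}_k$, and covariances $\boldsymbol{\Sigma}_k$ for $K$ classes. The sketch treats each node independently and omits the graph convolution, so it is a generic flow-plus-mixture formulation rather than the paper's exact model.

```latex
% Assumed notation (hypothetical, not from the source):
%   f        invertible flow, z = f(x)
%   \pi_k    mixture weight (class prior) of class k, k = 1..K
%   \mu_k, \Sigma_k   mean and covariance of the k-th Gaussian component
\begin{align}
p(y = k) &= \pi_k, \\
p(\mathbf{x} \mid y = k)
  &= \mathcal{N}\bigl(f(\mathbf{x});\, \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k\bigr)
     \left|\det \frac{\partial f(\mathbf{x})}{\partial \mathbf{x}}\right|, \\
p(y = k \mid \mathbf{x})
  &= \frac{\pi_k\, \mathcal{N}\bigl(f(\mathbf{x});\, \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k\bigr)}
          {\sum_{j=1}^{K} \pi_j\, \mathcal{N}\bigl(f(\mathbf{x});\, \boldsymbol{\mu}_j, \boldsymbol{\Sigma}_j\bigr)}.
\end{align}
```

Under these assumptions the Jacobian factor is shared by all classes and cancels in the posterior, so classification depends only on where $f(\mathbf{x})$ falls relative to the mixture components, which is also what gives the representation space its cluster structure.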

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-wang23y,
  title = {{GC}-Flow: A Graph-Based Flow Network for Effective Clustering},
  author = {Wang, Tianchun and Mirzazadeh, Farzaneh and Zhang, Xiang and Chen, Jie},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages = {36157--36173},
  year = {2023},
  editor = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = {202},
  series = {Proceedings of Machine Learning Research},
  month = {23--29 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v202/wang23y/wang23y.pdf},
  url = {https://proceedings.mlr.press/v202/wang23y.html},
  abstract = {Graph convolutional networks (GCNs) are discriminative models that directly model the class posterior $p(y|\mathbf{x})$ for semi-supervised classification of graph data. While being effective, as a representation learning approach, the node representations extracted from a GCN often miss useful information for effective clustering, because the objectives are different. In this work, we design normalizing flows that replace GCN layers, leading to a generative model that models both the class conditional likelihood $p(\mathbf{x}|y)$ and the class prior $p(y)$. The resulting neural network, GC-Flow, retains the graph convolution operations while being equipped with a Gaussian mixture representation space. It enjoys two benefits: it not only maintains the predictive power of GCN, but also produces well-separated clusters, due to the structuring of the representation space. We demonstrate these benefits on a variety of benchmark data sets. Moreover, we show that additional parameterization, such as that on the adjacency matrix used for graph convolutions, yields additional improvement in clustering.}
}
Endnote
%0 Conference Paper
%T GC-Flow: A Graph-Based Flow Network for Effective Clustering
%A Tianchun Wang
%A Farzaneh Mirzazadeh
%A Xiang Zhang
%A Jie Chen
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-wang23y
%I PMLR
%P 36157--36173
%U https://proceedings.mlr.press/v202/wang23y.html
%V 202
%X Graph convolutional networks (GCNs) are discriminative models that directly model the class posterior $p(y|\mathbf{x})$ for semi-supervised classification of graph data. While being effective, as a representation learning approach, the node representations extracted from a GCN often miss useful information for effective clustering, because the objectives are different. In this work, we design normalizing flows that replace GCN layers, leading to a generative model that models both the class conditional likelihood $p(\mathbf{x}|y)$ and the class prior $p(y)$. The resulting neural network, GC-Flow, retains the graph convolution operations while being equipped with a Gaussian mixture representation space. It enjoys two benefits: it not only maintains the predictive power of GCN, but also produces well-separated clusters, due to the structuring of the representation space. We demonstrate these benefits on a variety of benchmark data sets. Moreover, we show that additional parameterization, such as that on the adjacency matrix used for graph convolutions, yields additional improvement in clustering.
APA
Wang, T., Mirzazadeh, F., Zhang, X. & Chen, J. (2023). GC-Flow: A Graph-Based Flow Network for Effective Clustering. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:36157-36173. Available from https://proceedings.mlr.press/v202/wang23y.html.
