GSCAN: Graph Stability Clustering for Applications With Noise Using Edge-Aware Excess-of-Mass

Etzion Harari, Naphtali Abudarham, Roee Litman
Proceedings of the Second Learning on Graphs Conference, PMLR 231:9:1-9:15, 2024.

Abstract

Graph Clustering is required for the identification of communities and groups within a given network. In recent years, various attempts have been made to develop tools suitable for this purpose. Most recently, these attempts are based on the latest advancements in deep learning and especially in Graph Neural Networks (GNN). While some methods take into account the graph intrinsic topological structure throughout, surprisingly, the leading clustering methods ignore this during the final cluster assignment stage, which leads to sub-optimal results. In this paper, we propose GSCAN: a Graph Stability Clustering for Applications with Noise, which is based both on node features and on the graph structure. We base our approach on the celebrated method of Exess-of-Mass (EoM), which is based the principle of maximizing cluster stability. This method has additional desirable properties like resilience to outliers and the fact it doesn’t require an a-priory definition of the number of clusters. We extend EoM to work on the \ast intrinsic\ast graph structure and propose two possible post-processes to deal with one of EoM’s shortcomings - its tendency to over-flagging data-points as outliers. These post processes harness the graph topology and lead to superior performance, even compared to leading clustering approaches that are trained end-to-end. We show that the proposed approach can be implemented in a fast and scalable manner. Our claims are backed on three well-known benchmark datasets. Our code is available here: https://github.com/GraphEoM/GSCAN

Cite this Paper


BibTeX
@InProceedings{pmlr-v231-harari24a, title = {GSCAN: Graph Stability Clustering for Applications With Noise Using Edge-Aware Excess-of-Mass}, author = {Harari, Etzion and Abudarham, Naphtali and Litman, Roee}, booktitle = {Proceedings of the Second Learning on Graphs Conference}, pages = {9:1--9:15}, year = {2024}, editor = {Villar, Soledad and Chamberlain, Benjamin}, volume = {231}, series = {Proceedings of Machine Learning Research}, month = {27--30 Nov}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v231/harari24a/harari24a.pdf}, url = {https://proceedings.mlr.press/v231/harari24a.html}, abstract = {Graph Clustering is required for the identification of communities and groups within a given network. In recent years, various attempts have been made to develop tools suitable for this purpose. Most recently, these attempts are based on the latest advancements in deep learning and especially in Graph Neural Networks (GNN). While some methods take into account the graph intrinsic topological structure throughout, surprisingly, the leading clustering methods ignore this during the final cluster assignment stage, which leads to sub-optimal results. In this paper, we propose GSCAN: a Graph Stability Clustering for Applications with Noise, which is based both on node features and on the graph structure. We base our approach on the celebrated method of Exess-of-Mass (EoM), which is based the principle of maximizing cluster stability. This method has additional desirable properties like resilience to outliers and the fact it doesn’t require an a-priory definition of the number of clusters. We extend EoM to work on the \ast intrinsic\ast graph structure and propose two possible post-processes to deal with one of EoM’s shortcomings - its tendency to over-flagging data-points as outliers. These post processes harness the graph topology and lead to superior performance, even compared to leading clustering approaches that are trained end-to-end. We show that the proposed approach can be implemented in a fast and scalable manner. Our claims are backed on three well-known benchmark datasets. Our code is available here: https://github.com/GraphEoM/GSCAN} }
Endnote
%0 Conference Paper %T GSCAN: Graph Stability Clustering for Applications With Noise Using Edge-Aware Excess-of-Mass %A Etzion Harari %A Naphtali Abudarham %A Roee Litman %B Proceedings of the Second Learning on Graphs Conference %C Proceedings of Machine Learning Research %D 2024 %E Soledad Villar %E Benjamin Chamberlain %F pmlr-v231-harari24a %I PMLR %P 9:1--9:15 %U https://proceedings.mlr.press/v231/harari24a.html %V 231 %X Graph Clustering is required for the identification of communities and groups within a given network. In recent years, various attempts have been made to develop tools suitable for this purpose. Most recently, these attempts are based on the latest advancements in deep learning and especially in Graph Neural Networks (GNN). While some methods take into account the graph intrinsic topological structure throughout, surprisingly, the leading clustering methods ignore this during the final cluster assignment stage, which leads to sub-optimal results. In this paper, we propose GSCAN: a Graph Stability Clustering for Applications with Noise, which is based both on node features and on the graph structure. We base our approach on the celebrated method of Exess-of-Mass (EoM), which is based the principle of maximizing cluster stability. This method has additional desirable properties like resilience to outliers and the fact it doesn’t require an a-priory definition of the number of clusters. We extend EoM to work on the \ast intrinsic\ast graph structure and propose two possible post-processes to deal with one of EoM’s shortcomings - its tendency to over-flagging data-points as outliers. These post processes harness the graph topology and lead to superior performance, even compared to leading clustering approaches that are trained end-to-end. We show that the proposed approach can be implemented in a fast and scalable manner. Our claims are backed on three well-known benchmark datasets. Our code is available here: https://github.com/GraphEoM/GSCAN
APA
Harari, E., Abudarham, N. & Litman, R.. (2024). GSCAN: Graph Stability Clustering for Applications With Noise Using Edge-Aware Excess-of-Mass. Proceedings of the Second Learning on Graphs Conference, in Proceedings of Machine Learning Research 231:9:1-9:15 Available from https://proceedings.mlr.press/v231/harari24a.html.

Related Material