Network Tight Community Detection

Jiayi Deng, Xiaodong Yang, Jun Yu, Jun Liu, Zhaiming Shen, Danyang Huang, Huimin Cheng
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:10574-10596, 2024.

Abstract

Conventional community detection methods often categorize all nodes into clusters. However, the presumed community structure of interest may only be valid for a subset of nodes (named as ‘tight nodes’), while the rest of the network may consist of noninformative “scattered nodes”. For example, a protein-protein network often contains proteins that do not belong to specific biological functional modules but are involved in more general processes, or act as bridges between different functional modules. Forcing each of these proteins into a single cluster introduces unwanted biases and obscures the underlying biological implication. To address this issue, we propose a tight community detection (TCD) method to identify tight communities excluding scattered nodes. The algorithm enjoys a strong theoretical guarantee of tight node identification accuracy and is scalable for large networks. The superiority of the proposed method is demonstrated by various synthetic and real experiments.

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-deng24f,
  title = {Network Tight Community Detection},
  author = {Deng, Jiayi and Yang, Xiaodong and Yu, Jun and Liu, Jun and Shen, Zhaiming and Huang, Danyang and Cheng, Huimin},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages = {10574--10596},
  year = {2024},
  editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = {235},
  series = {Proceedings of Machine Learning Research},
  month = {21--27 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/deng24f/deng24f.pdf},
  url = {https://proceedings.mlr.press/v235/deng24f.html},
  abstract = {Conventional community detection methods often categorize all nodes into clusters. However, the presumed community structure of interest may only be valid for a subset of nodes (named as ‘tight nodes’), while the rest of the network may consist of noninformative “scattered nodes”. For example, a protein-protein network often contains proteins that do not belong to specific biological functional modules but are involved in more general processes, or act as bridges between different functional modules. Forcing each of these proteins into a single cluster introduces unwanted biases and obscures the underlying biological implication. To address this issue, we propose a tight community detection (TCD) method to identify tight communities excluding scattered nodes. The algorithm enjoys a strong theoretical guarantee of tight node identification accuracy and is scalable for large networks. The superiority of the proposed method is demonstrated by various synthetic and real experiments.}
}
Endnote
%0 Conference Paper
%T Network Tight Community Detection
%A Jiayi Deng
%A Xiaodong Yang
%A Jun Yu
%A Jun Liu
%A Zhaiming Shen
%A Danyang Huang
%A Huimin Cheng
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-deng24f
%I PMLR
%P 10574--10596
%U https://proceedings.mlr.press/v235/deng24f.html
%V 235
%X Conventional community detection methods often categorize all nodes into clusters. However, the presumed community structure of interest may only be valid for a subset of nodes (named as ‘tight nodes’), while the rest of the network may consist of noninformative “scattered nodes”. For example, a protein-protein network often contains proteins that do not belong to specific biological functional modules but are involved in more general processes, or act as bridges between different functional modules. Forcing each of these proteins into a single cluster introduces unwanted biases and obscures the underlying biological implication. To address this issue, we propose a tight community detection (TCD) method to identify tight communities excluding scattered nodes. The algorithm enjoys a strong theoretical guarantee of tight node identification accuracy and is scalable for large networks. The superiority of the proposed method is demonstrated by various synthetic and real experiments.
APA
Deng, J., Yang, X., Yu, J., Liu, J., Shen, Z., Huang, D. & Cheng, H. (2024). Network Tight Community Detection. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:10574-10596. Available from https://proceedings.mlr.press/v235/deng24f.html.
