A Divide and Conquer Framework for Distributed Graph Clustering
Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:504-513, 2015.
Graph clustering, the task of identifying clusters of closely connected nodes, is a fundamental technique in data analysis, with applications including community detection, VLSI network partitioning, and collaborative filtering. To improve the scalability of existing graph clustering algorithms, we propose a novel divide and conquer framework for graph clustering and establish theoretical guarantees of exact recovery of the clusters. An additional advantage of the proposed framework is that it can identify small clusters: the smallest cluster can be of size $o(\sqrt{n})$, in contrast to the $\Omega(\sqrt{n})$ required by standard methods. Extensive experiments on synthetic and real-world datasets demonstrate the efficiency and effectiveness of our framework.