A Divide and Conquer Framework for Distributed Graph Clustering

Wenzhuo Yang, Huan Xu
Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:504-513, 2015.

Abstract

Graph clustering is about identifying clusters of closely connected nodes, and is a fundamental technique of data analysis with many applications including community detection, VLSI network partitioning, collaborative filtering, and many others. In order to improve the scalability of existing graph clustering algorithms, we propose a novel divide and conquer framework for graph clustering, and establish theoretical guarantees of exact recovery of the clusters. One additional advantage of the proposed framework is that it can identify small clusters – the size of the smallest cluster can be of size o(\sqrt{n}), in contrast to Ω(\sqrt{n}) required by standard methods. Extensive experiments on synthetic and real-world datasets demonstrate the efficiency and effectiveness of our framework.

Cite this Paper


BibTeX
@InProceedings{pmlr-v37-yange15,
  title = {A Divide and Conquer Framework for Distributed Graph Clustering},
  author = {Wenzhuo Yang and Huan Xu},
  booktitle = {Proceedings of the 32nd International Conference on Machine Learning},
  pages = {504--513},
  year = {2015},
  editor = {Francis Bach and David Blei},
  volume = {37},
  series = {Proceedings of Machine Learning Research},
  address = {Lille, France},
  month = {07--09 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v37/yange15.pdf},
  url = {http://proceedings.mlr.press/v37/yange15.html},
  abstract = {Graph clustering is about identifying clusters of closely connected nodes, and is a fundamental technique of data analysis with many applications including community detection, VLSI network partitioning, collaborative filtering, and many others. In order to improve the scalability of existing graph clustering algorithms, we propose a novel divide and conquer framework for graph clustering, and establish theoretical guarantees of exact recovery of the clusters. One additional advantage of the proposed framework is that it can identify small clusters -- the size of the smallest cluster can be of size $o(\sqrt{n})$, in contrast to $\Omega(\sqrt{n})$ required by standard methods. Extensive experiments on synthetic and real-world datasets demonstrate the efficiency and effectiveness of our framework.}
}
Endnote
%0 Conference Paper
%T A Divide and Conquer Framework for Distributed Graph Clustering
%A Wenzhuo Yang
%A Huan Xu
%B Proceedings of the 32nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2015
%E Francis Bach
%E David Blei
%F pmlr-v37-yange15
%I PMLR
%J Proceedings of Machine Learning Research
%P 504--513
%U http://proceedings.mlr.press
%V 37
%W PMLR
%X Graph clustering is about identifying clusters of closely connected nodes, and is a fundamental technique of data analysis with many applications including community detection, VLSI network partitioning, collaborative filtering, and many others. In order to improve the scalability of existing graph clustering algorithms, we propose a novel divide and conquer framework for graph clustering, and establish theoretical guarantees of exact recovery of the clusters. One additional advantage of the proposed framework is that it can identify small clusters – the size of the smallest cluster can be of size o(\sqrt{n}), in contrast to Ω(\sqrt{n}) required by standard methods. Extensive experiments on synthetic and real-world datasets demonstrate the efficiency and effectiveness of our framework.
RIS
TY  - CPAPER
TI  - A Divide and Conquer Framework for Distributed Graph Clustering
AU  - Wenzhuo Yang
AU  - Huan Xu
BT  - Proceedings of the 32nd International Conference on Machine Learning
PY  - 2015/06/01
DA  - 2015/06/01
ED  - Francis Bach
ED  - David Blei
ID  - pmlr-v37-yange15
PB  - PMLR
SP  - 504
DP  - PMLR
EP  - 513
L1  - http://proceedings.mlr.press/v37/yange15.pdf
UR  - http://proceedings.mlr.press/v37/yange15.html
AB  - Graph clustering is about identifying clusters of closely connected nodes, and is a fundamental technique of data analysis with many applications including community detection, VLSI network partitioning, collaborative filtering, and many others. In order to improve the scalability of existing graph clustering algorithms, we propose a novel divide and conquer framework for graph clustering, and establish theoretical guarantees of exact recovery of the clusters. One additional advantage of the proposed framework is that it can identify small clusters – the size of the smallest cluster can be of size o(\sqrt{n}), in contrast to Ω(\sqrt{n}) required by standard methods. Extensive experiments on synthetic and real-world datasets demonstrate the efficiency and effectiveness of our framework.
ER  -
APA
Yang, W. & Xu, H. (2015). A Divide and Conquer Framework for Distributed Graph Clustering. Proceedings of the 32nd International Conference on Machine Learning, in PMLR 37:504-513.
