Detecting Overlapping and Correlated Communities without Pure Nodes: Identifiability and Algorithm

Kejun Huang, Xiao Fu
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:2859-2868, 2019.

Abstract

Many machine learning problems come in the form of networks with relational data between entities, and one of the key unsupervised learning tasks is to detect communities in such a network. We adopt the mixed-membership stochastic blockmodel as the underlying probabilistic model, and give conditions under which the memberships of a subset of nodes can be uniquely identified. Our method starts by constructing a second-order graph moment, which can be shown to converge to a specific product of the true parameters as the size of the network increases. To correctly recover the true membership parameters, we formulate an optimization problem using insights from convex geometry. We show that if the true memberships satisfy a so-called sufficiently scattered condition, then solving the proposed problem correctly identifies the ground truth. We also propose an efficient algorithm for detecting communities, which is significantly faster than prior work and with better convergence properties. Experiments on synthetic and real data justify the validity of the proposed learning framework for network data.

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-huang19c, title = {Detecting Overlapping and Correlated Communities without Pure Nodes: Identifiability and Algorithm}, author = {Huang, Kejun and Fu, Xiao}, booktitle = {Proceedings of the 36th International Conference on Machine Learning}, pages = {2859--2868}, year = {2019}, editor = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan}, volume = {97}, series = {Proceedings of Machine Learning Research}, month = {09--15 Jun}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v97/huang19c/huang19c.pdf}, url = {https://proceedings.mlr.press/v97/huang19c.html}, abstract = {Many machine learning problems come in the form of networks with relational data between entities, and one of the key unsupervised learning tasks is to detect communities in such a network. We adopt the mixed-membership stochastic blockmodel as the underlying probabilistic model, and give conditions under which the memberships of a subset of nodes can be uniquely identified. Our method starts by constructing a second-order graph moment, which can be shown to converge to a specific product of the true parameters as the size of the network increases. To correctly recover the true membership parameters, we formulate an optimization problem using insights from convex geometry. We show that if the true memberships satisfy a so-called sufficiently scattered condition, then solving the proposed problem correctly identifies the ground truth. We also propose an efficient algorithm for detecting communities, which is significantly faster than prior work and with better convergence properties. Experiments on synthetic and real data justify the validity of the proposed learning framework for network data.} }
Endnote
%0 Conference Paper %T Detecting Overlapping and Correlated Communities without Pure Nodes: Identifiability and Algorithm %A Kejun Huang %A Xiao Fu %B Proceedings of the 36th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2019 %E Kamalika Chaudhuri %E Ruslan Salakhutdinov %F pmlr-v97-huang19c %I PMLR %P 2859--2868 %U https://proceedings.mlr.press/v97/huang19c.html %V 97 %X Many machine learning problems come in the form of networks with relational data between entities, and one of the key unsupervised learning tasks is to detect communities in such a network. We adopt the mixed-membership stochastic blockmodel as the underlying probabilistic model, and give conditions under which the memberships of a subset of nodes can be uniquely identified. Our method starts by constructing a second-order graph moment, which can be shown to converge to a specific product of the true parameters as the size of the network increases. To correctly recover the true membership parameters, we formulate an optimization problem using insights from convex geometry. We show that if the true memberships satisfy a so-called sufficiently scattered condition, then solving the proposed problem correctly identifies the ground truth. We also propose an efficient algorithm for detecting communities, which is significantly faster than prior work and with better convergence properties. Experiments on synthetic and real data justify the validity of the proposed learning framework for network data.
APA
Huang, K. & Fu, X.. (2019). Detecting Overlapping and Correlated Communities without Pure Nodes: Identifiability and Algorithm. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:2859-2868 Available from https://proceedings.mlr.press/v97/huang19c.html.

Related Material