Community Detection in Hypergraphs: Optimal Statistical Limit and Efficient Algorithms

I Chien, Chung-Yi Lin, I-Hsiang Wang
Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, PMLR 84:871-879, 2018.

Abstract

In this paper, community detection in hypergraphs is explored. Under a generative hypergraph model called "d-wise hypergraph stochastic block model" (d-hSBM) which naturally extends the Stochastic Block Model (SBM) from graphs to d-uniform hypergraphs, the fundamental limit on the asymptotic minimax misclassified ratio is characterized. For proving the achievability, we propose a two-step polynomial time algorithm that provably achieves the fundamental limit in the sparse hypergraph regime. For proving the optimality, the lower bound of the minimax risk is set by finding a smaller parameter space which contains the most dominant error events, inspired by the analysis in the achievability part. It turns out that the minimax risk decays exponentially fast to zero as the number of nodes tends to infinity, and the rate function is a weighted combination of several divergence terms, each of which is the Renyi divergence of order 1/2 between two Bernoulli distributions. The Bernoulli distributions involved in the characterization of the rate function are those governing the random instantiation of hyperedges in d-hSBM. Experimental results on both synthetic and real-world data validate our theoretical finding.

Cite this Paper


BibTeX
@InProceedings{pmlr-v84-chien18a, title = {Community Detection in Hypergraphs: Optimal Statistical Limit and Efficient Algorithms}, author = {Chien, I and Lin, Chung-Yi and Wang, I-Hsiang}, booktitle = {Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics}, pages = {871--879}, year = {2018}, editor = {Storkey, Amos and Perez-Cruz, Fernando}, volume = {84}, series = {Proceedings of Machine Learning Research}, month = {09--11 Apr}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v84/chien18a/chien18a.pdf}, url = {https://proceedings.mlr.press/v84/chien18a.html}, abstract = {In this paper, community detection in hypergraphs is explored. Under a generative hypergraph model called "d-wise hypergraph stochastic block model" (d-hSBM) which naturally extends the Stochastic Block Model (SBM) from graphs to d-uniform hypergraphs, the fundamental limit on the asymptotic minimax misclassified ratio is characterized. For proving the achievability, we propose a two-step polynomial time algorithm that provably achieves the fundamental limit in the sparse hypergraph regime. For proving the optimality, the lower bound of the minimax risk is set by finding a smaller parameter space which contains the most dominant error events, inspired by the analysis in the achievability part. It turns out that the minimax risk decays exponentially fast to zero as the number of nodes tends to infinity, and the rate function is a weighted combination of several divergence terms, each of which is the Renyi divergence of order 1/2 between two Bernoulli distributions. The Bernoulli distributions involved in the characterization of the rate function are those governing the random instantiation of hyperedges in d-hSBM. Experimental results on both synthetic and real-world data validate our theoretical finding.} }
Endnote
%0 Conference Paper %T Community Detection in Hypergraphs: Optimal Statistical Limit and Efficient Algorithms %A I Chien %A Chung-Yi Lin %A I-Hsiang Wang %B Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2018 %E Amos Storkey %E Fernando Perez-Cruz %F pmlr-v84-chien18a %I PMLR %P 871--879 %U https://proceedings.mlr.press/v84/chien18a.html %V 84 %X In this paper, community detection in hypergraphs is explored. Under a generative hypergraph model called "d-wise hypergraph stochastic block model" (d-hSBM) which naturally extends the Stochastic Block Model (SBM) from graphs to d-uniform hypergraphs, the fundamental limit on the asymptotic minimax misclassified ratio is characterized. For proving the achievability, we propose a two-step polynomial time algorithm that provably achieves the fundamental limit in the sparse hypergraph regime. For proving the optimality, the lower bound of the minimax risk is set by finding a smaller parameter space which contains the most dominant error events, inspired by the analysis in the achievability part. It turns out that the minimax risk decays exponentially fast to zero as the number of nodes tends to infinity, and the rate function is a weighted combination of several divergence terms, each of which is the Renyi divergence of order 1/2 between two Bernoulli distributions. The Bernoulli distributions involved in the characterization of the rate function are those governing the random instantiation of hyperedges in d-hSBM. Experimental results on both synthetic and real-world data validate our theoretical finding.
APA
Chien, I., Lin, C. & Wang, I.. (2018). Community Detection in Hypergraphs: Optimal Statistical Limit and Efficient Algorithms. Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 84:871-879 Available from https://proceedings.mlr.press/v84/chien18a.html.

Related Material