Community detection in the hypergraph stochastic block model and reconstruction on hypertrees

Yuzhou Gu, Aaradhya Pandey
Proceedings of Thirty Seventh Conference on Learning Theory, PMLR 247:2166-2203, 2024.

Abstract

We study the weak recovery problem on the r-uniform hypergraph stochastic block model (r-HSBM) with two balanced communities. In this model, n vertices are randomly divided into two communities, and size-r hyperedges are added randomly depending on whether all vertices in the hyperedge are in the same community. The goal of weak recovery is to recover a non-trivial fraction of the communities given the hypergraph. Pal and Zhu (2021); Stephan and Zhu (2022) established that weak recovery is always possible above a natural threshold called the Kesten-Stigum (KS) threshold. For assortative models (i.e., monochromatic hyperedges are preferred), Gu and Polyanskiy (2023) proved that the KS threshold is tight if r4 or the expected degree d is small. For other cases, the tightness of the KS threshold remained open. In this paper we determine the tightness of the KS threshold for a wide range of parameters. We prove that for r6 and d large enough, the KS threshold is tight. This shows that there is no information-computation gap in this regime and partially confirms a conjecture of Angelini et al. (2015). On the other hand, we show that for r5, there exist parameters for which the KS threshold is not tight. In particular, for r7, the KS threshold is not tight if the model is disassortative (i.e., polychromatic hyperedges are preferred) or d is large enough. This provides more evidence supporting the existence of an information-computation gap in these cases. Furthermore, we establish asymptotic bounds on the weak recovery threshold for fixed r and large d. We also obtain a number of results regarding the broadcasting on hypertrees (BOHT) model, including the asymptotics of the reconstruction threshold for r7 and impossibility of robust reconstruction at criticality.

Cite this Paper


BibTeX
@InProceedings{pmlr-v247-gu24a, title = {Community detection in the hypergraph stochastic block model and reconstruction on hypertrees}, author = {Gu, Yuzhou and Pandey, Aaradhya}, booktitle = {Proceedings of Thirty Seventh Conference on Learning Theory}, pages = {2166--2203}, year = {2024}, editor = {Agrawal, Shipra and Roth, Aaron}, volume = {247}, series = {Proceedings of Machine Learning Research}, month = {30 Jun--03 Jul}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v247/gu24a/gu24a.pdf}, url = {https://proceedings.mlr.press/v247/gu24a.html}, abstract = {We study the weak recovery problem on the $r$-uniform hypergraph stochastic block model ($r$-HSBM) with two balanced communities. In this model, $n$ vertices are randomly divided into two communities, and size-$r$ hyperedges are added randomly depending on whether all vertices in the hyperedge are in the same community. The goal of weak recovery is to recover a non-trivial fraction of the communities given the hypergraph. Pal and Zhu (2021); Stephan and Zhu (2022) established that weak recovery is always possible above a natural threshold called the Kesten-Stigum (KS) threshold. For assortative models (i.e., monochromatic hyperedges are preferred), Gu and Polyanskiy (2023) proved that the KS threshold is tight if $r\le 4$ or the expected degree $d$ is small. For other cases, the tightness of the KS threshold remained open. In this paper we determine the tightness of the KS threshold for a wide range of parameters. We prove that for $r\le 6$ and $d$ large enough, the KS threshold is tight. This shows that there is no information-computation gap in this regime and partially confirms a conjecture of Angelini et al. (2015). On the other hand, we show that for $r\ge 5$, there exist parameters for which the KS threshold is not tight. In particular, for $r\ge 7$, the KS threshold is not tight if the model is disassortative (i.e., polychromatic hyperedges are preferred) or $d$ is large enough. This provides more evidence supporting the existence of an information-computation gap in these cases. Furthermore, we establish asymptotic bounds on the weak recovery threshold for fixed $r$ and large $d$. We also obtain a number of results regarding the broadcasting on hypertrees (BOHT) model, including the asymptotics of the reconstruction threshold for $r\ge 7$ and impossibility of robust reconstruction at criticality.} }
Endnote
%0 Conference Paper %T Community detection in the hypergraph stochastic block model and reconstruction on hypertrees %A Yuzhou Gu %A Aaradhya Pandey %B Proceedings of Thirty Seventh Conference on Learning Theory %C Proceedings of Machine Learning Research %D 2024 %E Shipra Agrawal %E Aaron Roth %F pmlr-v247-gu24a %I PMLR %P 2166--2203 %U https://proceedings.mlr.press/v247/gu24a.html %V 247 %X We study the weak recovery problem on the $r$-uniform hypergraph stochastic block model ($r$-HSBM) with two balanced communities. In this model, $n$ vertices are randomly divided into two communities, and size-$r$ hyperedges are added randomly depending on whether all vertices in the hyperedge are in the same community. The goal of weak recovery is to recover a non-trivial fraction of the communities given the hypergraph. Pal and Zhu (2021); Stephan and Zhu (2022) established that weak recovery is always possible above a natural threshold called the Kesten-Stigum (KS) threshold. For assortative models (i.e., monochromatic hyperedges are preferred), Gu and Polyanskiy (2023) proved that the KS threshold is tight if $r\le 4$ or the expected degree $d$ is small. For other cases, the tightness of the KS threshold remained open. In this paper we determine the tightness of the KS threshold for a wide range of parameters. We prove that for $r\le 6$ and $d$ large enough, the KS threshold is tight. This shows that there is no information-computation gap in this regime and partially confirms a conjecture of Angelini et al. (2015). On the other hand, we show that for $r\ge 5$, there exist parameters for which the KS threshold is not tight. In particular, for $r\ge 7$, the KS threshold is not tight if the model is disassortative (i.e., polychromatic hyperedges are preferred) or $d$ is large enough. This provides more evidence supporting the existence of an information-computation gap in these cases. Furthermore, we establish asymptotic bounds on the weak recovery threshold for fixed $r$ and large $d$. We also obtain a number of results regarding the broadcasting on hypertrees (BOHT) model, including the asymptotics of the reconstruction threshold for $r\ge 7$ and impossibility of robust reconstruction at criticality.
APA
Gu, Y. & Pandey, A.. (2024). Community detection in the hypergraph stochastic block model and reconstruction on hypertrees. Proceedings of Thirty Seventh Conference on Learning Theory, in Proceedings of Machine Learning Research 247:2166-2203 Available from https://proceedings.mlr.press/v247/gu24a.html.

Related Material