Community detection in the hypergraph stochastic block model and reconstruction on hypertrees

Yuzhou Gu, Aaradhya Pandey
Proceedings of Thirty Seventh Conference on Learning Theory, PMLR 247:2166-2203, 2024.

Abstract

We study the weak recovery problem on the $r$-uniform hypergraph stochastic block model ($r$-HSBM) with two balanced communities. In this model, $n$ vertices are randomly divided into two communities, and size-$r$ hyperedges are added randomly depending on whether all vertices in the hyperedge are in the same community. The goal of weak recovery is to recover a non-trivial fraction of the communities given the hypergraph. Pal and Zhu (2021); Stephan and Zhu (2022) established that weak recovery is always possible above a natural threshold called the Kesten-Stigum (KS) threshold. For assortative models (i.e., monochromatic hyperedges are preferred), Gu and Polyanskiy (2023) proved that the KS threshold is tight if $r\le 4$ or the expected degree $d$ is small. For other cases, the tightness of the KS threshold remained open. In this paper we determine the tightness of the KS threshold for a wide range of parameters. We prove that for $r\le 6$ and $d$ large enough, the KS threshold is tight. This shows that there is no information-computation gap in this regime and partially confirms a conjecture of Angelini et al. (2015). On the other hand, we show that for $r\ge 5$, there exist parameters for which the KS threshold is not tight. In particular, for $r\ge 7$, the KS threshold is not tight if the model is disassortative (i.e., polychromatic hyperedges are preferred) or $d$ is large enough. This provides more evidence supporting the existence of an information-computation gap in these cases. Furthermore, we establish asymptotic bounds on the weak recovery threshold for fixed $r$ and large $d$. We also obtain a number of results regarding the broadcasting on hypertrees (BOHT) model, including the asymptotics of the reconstruction threshold for $r\ge 7$ and impossibility of robust reconstruction at criticality.

Cite this Paper


BibTeX
@InProceedings{pmlr-v247-gu24a, title = {Community detection in the hypergraph stochastic block model and reconstruction on hypertrees}, author = {Gu, Yuzhou and Pandey, Aaradhya}, booktitle = {Proceedings of Thirty Seventh Conference on Learning Theory}, pages = {2166--2203}, year = {2024}, editor = {Agrawal, Shipra and Roth, Aaron}, volume = {247}, series = {Proceedings of Machine Learning Research}, month = {30 Jun--03 Jul}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v247/gu24a/gu24a.pdf}, url = {https://proceedings.mlr.press/v247/gu24a.html}, abstract = {We study the weak recovery problem on the $r$-uniform hypergraph stochastic block model ($r$-HSBM) with two balanced communities. In this model, $n$ vertices are randomly divided into two communities, and size-$r$ hyperedges are added randomly depending on whether all vertices in the hyperedge are in the same community. The goal of weak recovery is to recover a non-trivial fraction of the communities given the hypergraph. Pal and Zhu (2021); Stephan and Zhu (2022) established that weak recovery is always possible above a natural threshold called the Kesten-Stigum (KS) threshold. For assortative models (i.e., monochromatic hyperedges are preferred), Gu and Polyanskiy (2023) proved that the KS threshold is tight if $r\le 4$ or the expected degree $d$ is small. For other cases, the tightness of the KS threshold remained open. In this paper we determine the tightness of the KS threshold for a wide range of parameters. We prove that for $r\le 6$ and $d$ large enough, the KS threshold is tight. This shows that there is no information-computation gap in this regime and partially confirms a conjecture of Angelini et al. (2015). On the other hand, we show that for $r\ge 5$, there exist parameters for which the KS threshold is not tight. In particular, for $r\ge 7$, the KS threshold is not tight if the model is disassortative (i.e., polychromatic hyperedges are preferred) or $d$ is large enough. This provides more evidence supporting the existence of an information-computation gap in these cases. Furthermore, we establish asymptotic bounds on the weak recovery threshold for fixed $r$ and large $d$. We also obtain a number of results regarding the broadcasting on hypertrees (BOHT) model, including the asymptotics of the reconstruction threshold for $r\ge 7$ and impossibility of robust reconstruction at criticality.} }
Endnote
%0 Conference Paper %T Community detection in the hypergraph stochastic block model and reconstruction on hypertrees %A Yuzhou Gu %A Aaradhya Pandey %B Proceedings of Thirty Seventh Conference on Learning Theory %C Proceedings of Machine Learning Research %D 2024 %E Shipra Agrawal %E Aaron Roth %F pmlr-v247-gu24a %I PMLR %P 2166--2203 %U https://proceedings.mlr.press/v247/gu24a.html %V 247 %X We study the weak recovery problem on the $r$-uniform hypergraph stochastic block model ($r$-HSBM) with two balanced communities. In this model, $n$ vertices are randomly divided into two communities, and size-$r$ hyperedges are added randomly depending on whether all vertices in the hyperedge are in the same community. The goal of weak recovery is to recover a non-trivial fraction of the communities given the hypergraph. Pal and Zhu (2021); Stephan and Zhu (2022) established that weak recovery is always possible above a natural threshold called the Kesten-Stigum (KS) threshold. For assortative models (i.e., monochromatic hyperedges are preferred), Gu and Polyanskiy (2023) proved that the KS threshold is tight if $r\le 4$ or the expected degree $d$ is small. For other cases, the tightness of the KS threshold remained open. In this paper we determine the tightness of the KS threshold for a wide range of parameters. We prove that for $r\le 6$ and $d$ large enough, the KS threshold is tight. This shows that there is no information-computation gap in this regime and partially confirms a conjecture of Angelini et al. (2015). On the other hand, we show that for $r\ge 5$, there exist parameters for which the KS threshold is not tight. In particular, for $r\ge 7$, the KS threshold is not tight if the model is disassortative (i.e., polychromatic hyperedges are preferred) or $d$ is large enough. This provides more evidence supporting the existence of an information-computation gap in these cases. Furthermore, we establish asymptotic bounds on the weak recovery threshold for fixed $r$ and large $d$. We also obtain a number of results regarding the broadcasting on hypertrees (BOHT) model, including the asymptotics of the reconstruction threshold for $r\ge 7$ and impossibility of robust reconstruction at criticality.
APA
Gu, Y. & Pandey, A.. (2024). Community detection in the hypergraph stochastic block model and reconstruction on hypertrees. Proceedings of Thirty Seventh Conference on Learning Theory, in Proceedings of Machine Learning Research 247:2166-2203 Available from https://proceedings.mlr.press/v247/gu24a.html.

Related Material