[edit]
Community detection in the hypergraph stochastic block model and reconstruction on hypertrees
Proceedings of Thirty Seventh Conference on Learning Theory, PMLR 247:2166-2203, 2024.
Abstract
We study the weak recovery problem on the $r$-uniform hypergraph stochastic block model ($r$-HSBM) with two balanced communities. In this model, $n$ vertices are randomly divided into two communities, and size-$r$ hyperedges are added randomly depending on whether all vertices in the hyperedge are in the same community. The goal of weak recovery is to recover a non-trivial fraction of the communities given the hypergraph. Pal and Zhu (2021); Stephan and Zhu (2022) established that weak recovery is always possible above a natural threshold called the Kesten-Stigum (KS) threshold. For assortative models (i.e., monochromatic hyperedges are preferred), Gu and Polyanskiy (2023) proved that the KS threshold is tight if $r\le 4$ or the expected degree $d$ is small. For other cases, the tightness of the KS threshold remained open. In this paper we determine the tightness of the KS threshold for a wide range of parameters. We prove that for $r\le 6$ and $d$ large enough, the KS threshold is tight. This shows that there is no information-computation gap in this regime and partially confirms a conjecture of Angelini et al. (2015). On the other hand, we show that for $r\ge 5$, there exist parameters for which the KS threshold is not tight. In particular, for $r\ge 7$, the KS threshold is not tight if the model is disassortative (i.e., polychromatic hyperedges are preferred) or $d$ is large enough. This provides more evidence supporting the existence of an information-computation gap in these cases. Furthermore, we establish asymptotic bounds on the weak recovery threshold for fixed $r$ and large $d$. We also obtain a number of results regarding the broadcasting on hypertrees (BOHT) model, including the asymptotics of the reconstruction threshold for $r\ge 7$ and impossibility of robust reconstruction at criticality.