[edit]
The Fundamental Limits of Recovering Planted Subgraphs (extended abstract)
Proceedings of Thirty Eighth Conference on Learning Theory, PMLR 291:3578-3579, 2025.
Abstract
Given an arbitrary subgraph $H=H_n$ and $p=p_n\in(0,1)$, the planted subgraph model is defined as follows. A statistician observes the union of the "signal," which is a random "planted" copy $H^*$ of $H$, together with random "noise" in the form of an instance of an Erd{ö}s-R{é}nyi graph $G(n,p)$. The goal then of the statistician is to recover the planted $H^*$ from the observed graph. Our focus in this work is to understand the minimum mean-squared error (MMSE) in terms of recovering the edges of $H^*$, as a function of $p$ and $H$. A recent paper [MNSSZ23] characterizes the graphs for which this MMSE curve undergoes a sharp phase transition from $0$ to $1$ as $p$ increases, a behavior known as the All-or-Nothing phenomenon, up to a mild density assumption on $H$. However, their techniques fail to describe the MMSE curves for graphs that do not display such a sharp phase transition. In this paper, we provide a formula for the limiting MMSE curve for any graph $H=H_n$, up to the same mild density assumption. This curve is expressed in terms of a variational formula over pairs of subgraphs of $H$, and is inspired by the celebrated subgraph expectation thresholds from probabilistic combinatorics [KK07]. Furthermore, we give a polynomial-time description of the optimizers of this variational problem. This allows one to efficiently compute the MMSE curve for any given dense graph $H$. The proof relies on a novel graph decomposition as well as a min-max duality theorem which may be of independent interest. Our results generalize to the setting of planting arbitrary monotone boolean properties, where the statistician observes the union of a planted minimal element $A\subseteq[N]$ of a monotone property and a random $\mathrm{Ber}(p)^{\otimes N}$ vector. In this setting, we provide a variational formula inspired by the so-called "fractional" expectation threshold [Tal10], again describing the MMSE curve (in this case up to a multiplicative constant).