Recovery thresholds for hidden weighted sparse graphs (extended abstract)

Zhe Hou; Jingcheng Liu

Recovery thresholds for hidden weighted sparse graphs (extended abstract)

Zhe Hou, Jingcheng Liu

Proceedings of Thirty Ninth Conference on Learning Theory, PMLR 336:3280-3284, 2026.

Abstract

Recovering structural information from noisy high-dimensional data is a fundamental task in statistical inference. We investigate the information-theoretic recovery thresholds for a graph hidden in a randomly weighted complete graph. Specifically, an unknown graph $H^* \in \mathcal H_n$ is chosen uniformly at random, and hidden in a complete graph of $n$ vertices through edge weights: for an edge $e \in H^*$, its weight is distributed independently according to $\mathcal{P}_n$; otherwise the edge weight is distributed independently according to $\mathcal{Q}_n$. The goal is to recover the hidden edge set from these noisy observations. Our primary focus on \emph{almost exact recovery}, where all but a vanishing fraction of the edges of $H^*$ must be identified. By choosing $\mathcal{P} = \mathrm{Bern}(1), \mathcal{Q} = \mathrm{Bern}(q)$, this model captures the well-studied planted Erdős-Rényi recovery setting, and other weighted formulations such as Gaussian and Exponential distributions. Assuming a local Lipschitzness of the Rényi divergence between distributions $\mathcal{P}_n$ and $\mathcal{Q}_n$, and a mild density condition for the graphs $\mathcal H_n$, we give a unified characterization of the information-theoretic limit for recovering almost all of $H$ (also known as almost exact recovery). Our characterization uses the KL divergence $D_{\mathrm{KL}}(\mathcal{P}_n\|\mathcal{Q}_n)$ as the signal-to-noise metric. We show that there is a critical threshold $D_c$ governed by the logarithm of the first moment threshold of $\mathcal H_n$ in the Erdős-Rényi random graph model $G(n,p)$. For any constant $\eta$, if $D_{\mathrm{KL}} \ge (1+\eta)D_c$, the maximum likelihood estimator (MLE) achieves almost exact recovery; conversely if $D_{\mathrm{KL}} \le (1-\eta)D_c$, almost exact recovery is information-theoretically impossible. Our MLE analysis is based on an extension of (Jian Ding, Yihong Wu, Jiaming Xu, Dana Yang, 2020); While our lower bound involves deriving a distance-based Fano-type inequality based on Sibson’s $\alpha$-mutual information, and it also extends to the task of partial recovery, in which only a constant $\lambda$-fraction of the hidden graph is required to be recovered. As a result, if $D_{\mathrm{KL}} \le (\lambda-\eta)D_c$, we also proved that the recovery of any $\lambda$-fraction must fail with high probability. Another interesting phenomenon exhibited by some of these models is the “All-or-Nothing” (AoN), where one can either recover almost all of the hidden graph or essentially nothing at all. For several natural distributional families, including certain Gaussian, Bernoulli and Exponential regimes, we establish an All-or-Nothing (AoN) threshold phenomenon at the exponential scale. For Gaussians, we obtained an AoN by lifting the sharp almost exact recovery threshold through the I-MMSE relation; For the others, we combine a second-moment argument inspired by planted subgraph recovery (Elchanan Mossel, Jonathan Niles-Weed, Youngtak Sohn, Nike Sun, Ilias Zadik, 2023) with the second-order Rényi divergence. We also provide distributional examples where AoN is not universal at this level of generality, and our partial recovery lower bound is already the best possible.

Cite this Paper

BibTeX

@InProceedings{pmlr-v336-hou26a,
  title = 	 {Recovery thresholds for hidden weighted sparse graphs (extended abstract)},
  author =       {Hou, Zhe and Liu, Jingcheng},
  booktitle = 	 {Proceedings of Thirty Ninth Conference on Learning Theory},
  pages = 	 {3280--3284},
  year = 	 {2026},
  editor = 	 {Hanneke, Steve and Lattimore, Tor},
  volume = 	 {336},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {29 Jun--03 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v336/main/assets/hou26a/hou26a.pdf},
  url = 	 {https://proceedings.mlr.press/v336/hou26a.html},
  abstract = 	 {Recovering structural information from noisy high-dimensional data is a fundamental task in statistical inference. We investigate the information-theoretic recovery thresholds for a graph hidden in a randomly weighted complete graph. Specifically, an unknown graph $H^* \in \mathcal H_n$ is chosen uniformly at random, and hidden in a complete graph of $n$ vertices through edge weights: for an edge $e \in H^*$, its weight is distributed independently according to $\mathcal{P}_n$; otherwise the edge weight is distributed independently according to $\mathcal{Q}_n$. The goal is to recover the hidden edge set from these noisy observations. Our primary focus on \emph{almost exact recovery}, where all but a vanishing fraction of the edges of $H^*$ must be identified. By choosing $\mathcal{P} = \mathrm{Bern}(1), \mathcal{Q} = \mathrm{Bern}(q)$, this model captures the well-studied planted Erdős-Rényi recovery setting, and other weighted formulations such as Gaussian and Exponential distributions. Assuming a local Lipschitzness of the Rényi divergence between distributions $\mathcal{P}_n$ and $\mathcal{Q}_n$, and a mild density condition for the graphs $\mathcal H_n$, we give a unified characterization of the information-theoretic limit for recovering almost all of $H$ (also known as almost exact recovery). Our characterization uses the KL divergence $D_{\mathrm{KL}}(\mathcal{P}_n\|\mathcal{Q}_n)$ as the signal-to-noise metric. We show that there is a critical threshold $D_c$ governed by the logarithm of the first moment threshold of $\mathcal H_n$ in the Erdős-Rényi random graph model $G(n,p)$. For any constant $\eta$, if $D_{\mathrm{KL}} \ge (1+\eta)D_c$, the maximum likelihood estimator (MLE) achieves almost exact recovery; conversely if $D_{\mathrm{KL}} \le (1-\eta)D_c$, almost exact recovery is information-theoretically impossible. Our MLE analysis is based on an extension of (Jian Ding, Yihong Wu, Jiaming Xu, Dana Yang, 2020); While our lower bound involves deriving a distance-based Fano-type inequality based on Sibson’s $\alpha$-mutual information, and it also extends to the task of partial recovery, in which only a constant $\lambda$-fraction of the hidden graph is required to be recovered. As a result, if $D_{\mathrm{KL}} \le (\lambda-\eta)D_c$, we also proved that the recovery of any $\lambda$-fraction must fail with high probability. Another interesting phenomenon exhibited by some of these models is the “All-or-Nothing” (AoN), where one can either recover almost all of the hidden graph or essentially nothing at all. For several natural distributional families, including  certain Gaussian, Bernoulli and Exponential regimes, we establish an All-or-Nothing (AoN) threshold phenomenon at the exponential scale. For Gaussians, we obtained an AoN by lifting the sharp almost exact recovery threshold through the I-MMSE relation; For the others, we combine a second-moment argument inspired by planted subgraph recovery (Elchanan Mossel, Jonathan Niles-Weed, Youngtak Sohn, Nike Sun, Ilias Zadik, 2023) with the second-order Rényi divergence.  We also provide distributional examples where AoN is not universal at this level of generality, and our partial recovery lower bound is already the best possible.}
}

Endnote

%0 Conference Paper
%T Recovery thresholds for hidden weighted sparse graphs (extended abstract)
%A Zhe Hou
%A Jingcheng Liu
%B Proceedings of Thirty Ninth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2026
%E Steve Hanneke
%E Tor Lattimore	
%F pmlr-v336-hou26a
%I PMLR
%P 3280--3284
%U https://proceedings.mlr.press/v336/hou26a.html
%V 336
%X Recovering structural information from noisy high-dimensional data is a fundamental task in statistical inference. We investigate the information-theoretic recovery thresholds for a graph hidden in a randomly weighted complete graph. Specifically, an unknown graph $H^* \in \mathcal H_n$ is chosen uniformly at random, and hidden in a complete graph of $n$ vertices through edge weights: for an edge $e \in H^*$, its weight is distributed independently according to $\mathcal{P}_n$; otherwise the edge weight is distributed independently according to $\mathcal{Q}_n$. The goal is to recover the hidden edge set from these noisy observations. Our primary focus on \emph{almost exact recovery}, where all but a vanishing fraction of the edges of $H^*$ must be identified. By choosing $\mathcal{P} = \mathrm{Bern}(1), \mathcal{Q} = \mathrm{Bern}(q)$, this model captures the well-studied planted Erdős-Rényi recovery setting, and other weighted formulations such as Gaussian and Exponential distributions. Assuming a local Lipschitzness of the Rényi divergence between distributions $\mathcal{P}_n$ and $\mathcal{Q}_n$, and a mild density condition for the graphs $\mathcal H_n$, we give a unified characterization of the information-theoretic limit for recovering almost all of $H$ (also known as almost exact recovery). Our characterization uses the KL divergence $D_{\mathrm{KL}}(\mathcal{P}_n\|\mathcal{Q}_n)$ as the signal-to-noise metric. We show that there is a critical threshold $D_c$ governed by the logarithm of the first moment threshold of $\mathcal H_n$ in the Erdős-Rényi random graph model $G(n,p)$. For any constant $\eta$, if $D_{\mathrm{KL}} \ge (1+\eta)D_c$, the maximum likelihood estimator (MLE) achieves almost exact recovery; conversely if $D_{\mathrm{KL}} \le (1-\eta)D_c$, almost exact recovery is information-theoretically impossible. Our MLE analysis is based on an extension of (Jian Ding, Yihong Wu, Jiaming Xu, Dana Yang, 2020); While our lower bound involves deriving a distance-based Fano-type inequality based on Sibson’s $\alpha$-mutual information, and it also extends to the task of partial recovery, in which only a constant $\lambda$-fraction of the hidden graph is required to be recovered. As a result, if $D_{\mathrm{KL}} \le (\lambda-\eta)D_c$, we also proved that the recovery of any $\lambda$-fraction must fail with high probability. Another interesting phenomenon exhibited by some of these models is the “All-or-Nothing” (AoN), where one can either recover almost all of the hidden graph or essentially nothing at all. For several natural distributional families, including  certain Gaussian, Bernoulli and Exponential regimes, we establish an All-or-Nothing (AoN) threshold phenomenon at the exponential scale. For Gaussians, we obtained an AoN by lifting the sharp almost exact recovery threshold through the I-MMSE relation; For the others, we combine a second-moment argument inspired by planted subgraph recovery (Elchanan Mossel, Jonathan Niles-Weed, Youngtak Sohn, Nike Sun, Ilias Zadik, 2023) with the second-order Rényi divergence.  We also provide distributional examples where AoN is not universal at this level of generality, and our partial recovery lower bound is already the best possible.

APA

Hou, Z. & Liu, J.. (2026). Recovery thresholds for hidden weighted sparse graphs (extended abstract). Proceedings of Thirty Ninth Conference on Learning Theory, in Proceedings of Machine Learning Research 336:3280-3284 Available from https://proceedings.mlr.press/v336/hou26a.html.

Related Material

Download PDF