[edit]
Planting trees in graphs, and finding them back
Proceedings of the Thirty-Second Conference on Learning Theory, PMLR 99:2341-2371, 2019.
Abstract
In this paper we study the two inference problems of detection and reconstruction in the context of planted structures in sparse Erdős-Rényi random graphs G(n,λ/n) with fixed average degree λ>0. Motivated by a problem of communication security, we focus on the case where the planted structure consists in the addition of a tree graph. In the case of planted line graphs, we establish the following phase diagram for detection and reconstruction. In a low density region where the average degree λ of the original graph is below some critical value λc=1, both detection and reconstruction go from impossible to easy as the line length K crosses some critical value K∗=ln(n)/ln(1/λ), where n is the number of nodes in the graph. In a high density region where λ>λc, detection goes from impossible to easy as K goes from o(√n) to ω(√n). In contrast, reconstruction remains impossible so long as K=o(n). We then consider planted D-ary trees of varying depth h and 2≤D≤O(1). For these we identify a low-density region λ<λD, where λD is the threshold for emergence of the D-core in Erdős-Rényi random graphs G(n,λ/n) for which the following holds. There is a threshold h∗=g(D)ln(ln(n)) with the following properties. Detection goes from impossible to feasible as h crosses h∗. Interestingly, we show that only partial reconstruction is feasible at best for h≥h∗. We conjecture a similar picture to hold for D-ary trees as for lines in the high-density region λ>λD, but confirm only the following part of this picture: Detection is easy for D-ary trees of size ω(√n), while at best only partial reconstruction is feasible for D-ary trees of any size o(n). These results provide a clear contrast with the corresponding picture for detection and reconstruction of {\em low rank} planted structures, such as dense subgraphs and block communities. In the examples we study, there is i) an absence of hard phases for both detection and reconstruction, and ii) a discrepancy between detection and reconstruction, the latter being impossible for a wide range of parameters where detection is easy. The latter property does not hold for previously studied low rank planted structures.