[edit]
Reducibility and Computational Lower Bounds for Problems with Planted Sparse Structure
Proceedings of the 31st Conference On Learning Theory, PMLR 75:48-166, 2018.
Abstract
Recently, research in unsupervised learning has gravitated towards exploring statistical-computational gaps induced by sparsity. A line of work initiated in Berthet and Rigollet (2013) has aimed to explain these gaps through reductions to conjecturally hard problems from complexity theory. However, the delicate nature of average-case reductions has limited the development of techniques and often led to weaker hardness results that only apply to algorithms robust to different noise distributions or that do not need to know the parameters of the problem. We introduce several new techniques to give a web of average-case reductions showing strong computational lower bounds based on the planted clique conjecture. Our new lower bounds include:
- Planted Independent Set: We show tight lower bounds for detecting a planted independent set of size $k$ in a sparse Erdős-Rényi graph of size $n$ with edge density $\tilde{\Theta}(n^{-\alpha})$.
- Planted Dense Subgraph: If $p > q$ are the edge densities inside and outside of the community, we show the first lower bounds for the general regime $q = \tilde{\Theta}(n^{-\alpha})$ and $p - q = \tilde{\Theta}(n^{-\gamma})$ where $\gamma \ge \alpha$, matching the lower bounds predicted in Chen and Xu (2016). Our lower bounds apply to a deterministic community size $k$, resolving a question raised in Hajek et al. (2015).
- Biclustering: We show strong lower bounds for Gaussian biclustering as a simple hypothesis testing problem to detect a uniformly at random planted flat $k \times k$ submatrix.
- Sparse Rank-1 Submatrix: We show that detection in the sparse spiked Wigner model is often harder than biclustering, and are able to obtain two different tight lower bounds for these problems with different reductions from planted clique.
- Sparse PCA: We give a reduction between rank-1 submatrix and sparse PCA to obtain tight lower bounds in the less sparse regime $k \gg \sqrt{n}$, when the spectral algorithm is optimal over the SDP. We give an alternate reduction recovering the lower bounds of Berthet and Rigollet (2013) and Gao et al. (2017) in the simple hypothesis testing variant of sparse PCA. We also observe a subtlety in the complexity of sparse PCA that arises when the planted vector is biased.
- Subgraph Stochastic Block Model: We introduce a model where two small communities are planted in an Erdős-Rényi graph of the same average edge density and give tight lower bounds yielding different hard regimes than planted dense subgraph.