Semi-Random Sparse Recovery in Nearly-Linear Time

Jonathan Kelner, Jerry Li, Allen X. Liu, Aaron Sidford, Kevin Tian
Proceedings of Thirty Sixth Conference on Learning Theory, PMLR 195:2352-2398, 2023.

Abstract

Sparse recovery is one of the most fundamental and well-studied inverse problems. Standard statistical formulations of the problem are provably solved by general convex programming techniques and by more practical, fast (nearly-linear time) iterative methods. However, these latter “fast algorithms” have previously been observed to be brittle in various real-world settings.

We investigate the brittleness of fast sparse recovery algorithms to generative model changes through the lens of studying their robustness to a “helpful” semi-random adversary, a framework for testing overfitting to input assumptions. We consider the following basic model: let $\mathbf{A} \in \mathbb{R}^{n \times d}$ be a measurement matrix which contains an unknown subset of rows $\mathbf{G} \in \mathbb{R}^{m \times d}$ that are bounded and satisfy the restricted isometry property (RIP), but which is otherwise arbitrary. Letting $x^\star \in \mathbb{R}^d$ be $s$-sparse, and given either exact measurements $b = \mathbf{A} x^\star$ or noisy measurements $b = \mathbf{A} x^\star + \xi$, we design algorithms recovering $x^\star$ information-theoretically optimally in nearly-linear time. We extend our algorithm to hold for weaker generative models, relaxing our planted RIP row subset assumption to a natural weighted variant, and show that our method’s guarantees naturally interpolate with the quality of the measurement matrix and, in some parameter regimes, allow it to run in sublinear time.

Our approach differs from that of prior fast iterative methods with provable guarantees under semi-random generative models [CG18, LSTZ20], which typically separate the problem of learning the planted instance from the estimation problem, i.e. they attempt to first learn the planted “good” instance (in our case, the matrix $\mathbf{G}$). However, natural conditions on a submatrix which make sparse recovery tractable, such as RIP, are NP-hard to verify, so first learning a sufficient row reweighting appears challenging. We eschew this approach and design a new iterative method, tailored to the geometry of sparse recovery, which is provably robust to our semi-random model. Our hope is that our approach opens the door to new robust, efficient algorithms for other natural statistical inverse problems.
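To make the semi-random model concrete, the Python sketch below builds an instance of the kind described above and runs vanilla iterative hard thresholding (IHT), a standard fast sparse recovery method, on it. This is purely illustrative and is not the paper's algorithm: the planted block is taken to be i.i.d. Gaussian (which satisfies RIP with high probability), the extra “helpful” rows are duplicated planted rows, and all dimensions and the step-size heuristic are assumptions chosen for the example.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions): ambient dimension d, sparsity s,
# m planted rows, n total rows.
d, s, m, n = 2000, 10, 400, 800

# s-sparse ground truth x_star.
x_star = np.zeros(d)
support = rng.choice(d, size=s, replace=False)
x_star[support] = rng.standard_normal(s)

# Planted block G: i.i.d. Gaussian rows scaled by 1/sqrt(m); such a block
# satisfies the RIP for s-sparse vectors with high probability once
# m = Omega(s log(d/s)).
G = rng.standard_normal((m, d)) / np.sqrt(m)

# Semi-random part: n - m extra rows consistent with x_star (here,
# duplicated planted rows), so every measurement is "helpful", yet the
# row distribution of A is no longer RIP-normalized.
extra = np.repeat(G[: (n - m) // 4], 4, axis=0)
A = np.vstack([G, extra])
rng.shuffle(A)  # the solver does not know which rows form G

b = A @ x_star + 0.01 * rng.standard_normal(n)  # noisy measurements

def iht(A, b, s, step, iters=500):
    """Vanilla iterative hard thresholding: a gradient step on
    0.5 * ||Ax - b||^2 followed by projection onto s-sparse vectors."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = x - step * (A.T @ (A @ x - b))
        keep = np.argpartition(np.abs(x), -s)[-s:]
        x_new = np.zeros_like(x)
        x_new[keep] = x[keep]
        x = x_new
    return x

# Heuristic step size 1 / ||A||_2^2 (an assumption for this sketch).
x_hat = iht(A, b, s, step=1.0 / np.linalg.norm(A, 2) ** 2)
print("relative error:", np.linalg.norm(x_hat - x_star) / np.linalg.norm(x_star))

Because the duplicated rows skew the restricted spectrum of $\mathbf{A}^\top \mathbf{A}$, the RIP-based step-size and convergence analysis behind IHT no longer applies to $\mathbf{A}$, even though the planted block $\mathbf{G}$ alone would certify recovery; this gap between the planted instance and the full matrix is exactly what the semi-random model probes.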

Cite this Paper


BibTeX
@InProceedings{pmlr-v195-kelner23a,
  title     = {Semi-Random Sparse Recovery in Nearly-Linear Time},
  author    = {Kelner, Jonathan and Li, Jerry and Liu, Allen X. and Sidford, Aaron and Tian, Kevin},
  booktitle = {Proceedings of Thirty Sixth Conference on Learning Theory},
  pages     = {2352--2398},
  year      = {2023},
  editor    = {Neu, Gergely and Rosasco, Lorenzo},
  volume    = {195},
  series    = {Proceedings of Machine Learning Research},
  month     = {12--15 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v195/kelner23a/kelner23a.pdf},
  url       = {https://proceedings.mlr.press/v195/kelner23a.html}
}
Endnote
%0 Conference Paper
%T Semi-Random Sparse Recovery in Nearly-Linear Time
%A Jonathan Kelner
%A Jerry Li
%A Allen X. Liu
%A Aaron Sidford
%A Kevin Tian
%B Proceedings of Thirty Sixth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2023
%E Gergely Neu
%E Lorenzo Rosasco
%F pmlr-v195-kelner23a
%I PMLR
%P 2352--2398
%U https://proceedings.mlr.press/v195/kelner23a.html
%V 195
APA
Kelner, J., Li, J., Liu, A.X., Sidford, A. & Tian, K. (2023). Semi-Random Sparse Recovery in Nearly-Linear Time. Proceedings of Thirty Sixth Conference on Learning Theory, in Proceedings of Machine Learning Research 195:2352-2398. Available from https://proceedings.mlr.press/v195/kelner23a.html.