Fast Non-convex Matrix Sensing with Optimal Sample Complexity

Jian-Feng Cai, Tong Wu, Ruizhe Xia
Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, PMLR 286:497-520, 2025.

Abstract

We study the problem of recovering an unknown $d_1 \times d_2$ rank-$r$ matrix from $m$ random linear measurements. Convex methods achieve the optimal sample complexity $m = \Omega(r(d_1 + d_2))$ but are computationally expensive. Non-convex approaches, while more computationally efficient, often require suboptimal sample complexity $m = \Omega(r^2(d_1 + d_2))$. A recent advance achieves $m = \Omega(rd_1)$ for a non-convex approach but relies on the restrictive assumption of positive semidefinite (PSD) matrices and suffers from slow convergence in ill-conditioned settings. Bridging this gap, we show that Riemannian gradient descent (RGD) achieves both optimal sample complexity and computational efficiency without requiring the PSD assumption. Specifically, for Gaussian measurements, RGD exactly recovers the low-rank matrix with $m = \Omega(r(d_1 + d_2))$, matching the information-theoretic lower bound, and converges linearly to the global minimum with an arbitrarily small convergence rate.
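
The abstract does not include an implementation, but the RGD template it refers to typically consists of a spectral initialization, projection of the Euclidean gradient onto the tangent space of the fixed-rank manifold, and a truncated-SVD retraction. Below is a minimal NumPy sketch under those assumptions; the function names, the exact line-search step size, and the problem sizes in the usage snippet are illustrative and are not taken from the paper.

    import numpy as np

    def trunc_svd(M, r):
        # Best rank-r approximation of M (the retraction back onto the rank-r manifold)
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        return (U[:, :r] * s[:r]) @ Vt[:r]

    def proj_tangent(U, V, G):
        # Orthogonal projection of G onto the tangent space of the rank-r manifold
        # at X = U S V^T:  P_T(G) = U U^T G + G V V^T - U U^T G V V^T
        UtG = U.T @ G
        return U @ UtG + (G @ V) @ V.T - U @ (UtG @ V) @ V.T

    def rgd_matrix_sensing(A, y, r, n_iters=300, tol=1e-10):
        # Hypothetical sketch of Riemannian gradient descent for the sensing model
        # y_i = <A_i, X*>, with A of shape (m, d1, d2) and y of shape (m,).
        m = A.shape[0]
        # Spectral initialization: rank-r truncation of A^*(y) / m
        X = trunc_svd(np.tensordot(y, A, axes=(0, 0)) / m, r)
        for _ in range(n_iters):
            res = np.tensordot(A, X, axes=([1, 2], [0, 1])) - y        # A(X) - y
            if np.linalg.norm(res) <= tol * np.linalg.norm(y):
                break
            grad = np.tensordot(res, A, axes=(0, 0)) / m               # gradient of (1/2m)||A(X) - y||^2
            U, _, Vt = np.linalg.svd(X, full_matrices=False)
            W = proj_tangent(U[:, :r], Vt[:r].T, grad)                 # Riemannian gradient
            AW = np.tensordot(A, W, axes=([1, 2], [0, 1]))
            alpha = m * np.linalg.norm(W) ** 2 / (np.linalg.norm(AW) ** 2 + 1e-30)  # exact line search
            X = trunc_svd(X - alpha * W, r)                            # descend, then retract to rank r
        return X

    # Illustrative usage with m proportional to r * (d1 + d2)
    rng = np.random.default_rng(0)
    d1, d2, r = 30, 40, 2
    m = 6 * r * (d1 + d2)
    X_star = rng.standard_normal((d1, r)) @ rng.standard_normal((r, d2))
    A = rng.standard_normal((m, d1, d2))
    y = np.tensordot(A, X_star, axes=([1, 2], [0, 1]))
    X_hat = rgd_matrix_sensing(A, y, r)
    print("relative error:", np.linalg.norm(X_hat - X_star) / np.linalg.norm(X_star))

The choice m = 6 r (d_1 + d_2) in the snippet is only meant to mirror the $\Omega(r(d_1 + d_2))$ scaling discussed in the abstract; the constant 6 is arbitrary and not a claim from the paper.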

Cite this Paper

BibTeX
@InProceedings{pmlr-v286-cai25b,
  title = {Fast Non-convex Matrix Sensing with Optimal Sample Complexity},
  author = {Cai, Jian-Feng and Wu, Tong and Xia, Ruizhe},
  booktitle = {Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence},
  pages = {497--520},
  year = {2025},
  editor = {Chiappa, Silvia and Magliacane, Sara},
  volume = {286},
  series = {Proceedings of Machine Learning Research},
  month = {21--25 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v286/main/assets/cai25b/cai25b.pdf},
  url = {https://proceedings.mlr.press/v286/cai25b.html},
  abstract = {We study the problem of recovering an unknown $d_1 \times d_2$ rank-$r$ matrix from $m$ random linear measurements. Convex methods achieve the optimal sample complexity $m = \Omega(r(d_1 + d_2))$ but are computationally expensive. Non-convex approaches, while more computationally efficient, often require suboptimal sample complexity $m = \Omega(r^2(d_1 + d_2))$. Recent advance achieves $m = \Omega(rd_1)$ for a non-convex approach but relies on the restrictive assumption of positive semidefinite (PSD) matrices and suffers from slow convergence in ill-conditioned settings. Bridging this gap, we show that Riemannian gradient descent (RGD) achieves both optimal sample complexity and computational efficiency without requiring the PSD assumption. Specifically, for Gaussian measurements, RGD exactly recovers the low-rank matrix with $m = \Omega(r(d_1 + d_2))$, matching the information-theoretic lower bound, and converges linearly to the global minimum with an arbitrarily small convergence rate.}
}
Endnote
%0 Conference Paper
%T Fast Non-convex Matrix Sensing with Optimal Sample Complexity
%A Jian-Feng Cai
%A Tong Wu
%A Ruizhe Xia
%B Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2025
%E Silvia Chiappa
%E Sara Magliacane
%F pmlr-v286-cai25b
%I PMLR
%P 497--520
%U https://proceedings.mlr.press/v286/cai25b.html
%V 286
%X We study the problem of recovering an unknown $d_1 \times d_2$ rank-$r$ matrix from $m$ random linear measurements. Convex methods achieve the optimal sample complexity $m = \Omega(r(d_1 + d_2))$ but are computationally expensive. Non-convex approaches, while more computationally efficient, often require suboptimal sample complexity $m = \Omega(r^2(d_1 + d_2))$. Recent advance achieves $m = \Omega(rd_1)$ for a non-convex approach but relies on the restrictive assumption of positive semidefinite (PSD) matrices and suffers from slow convergence in ill-conditioned settings. Bridging this gap, we show that Riemannian gradient descent (RGD) achieves both optimal sample complexity and computational efficiency without requiring the PSD assumption. Specifically, for Gaussian measurements, RGD exactly recovers the low-rank matrix with $m = \Omega(r(d_1 + d_2))$, matching the information-theoretic lower bound, and converges linearly to the global minimum with an arbitrarily small convergence rate.
APA
Cai, J., Wu, T. & Xia, R. (2025). Fast Non-convex Matrix Sensing with Optimal Sample Complexity. Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 286:497-520. Available from https://proceedings.mlr.press/v286/cai25b.html.