On the Curse of Dimensionality in Private Sparse Covariance Estimation and PCA

Syamantak Kumar, Shourya Pandey, Purnamrita Sarkar, Kevin Tian
Proceedings of Thirty Ninth Conference on Learning Theory, PMLR 336:4349-4400, 2026.

Abstract

We study high-dimensional differentially private (DP) covariance estimation in the operator norm, and principal component analysis (PCA), under $k$-row-column sparsity ($k$-RCS) of the covariance matrix. In the non-private setting, it is known that $\operatorname{poly}(k,\log d)$ samples suffice to solve both of these problems. However, the only comparable result known under DP (Wang and Xu, 2021) requires $\Omega(d)$ samples under standard parameterizations of the problem. We investigate when this curse of dimensionality is inherent for sparse covariance estimation tasks under DP. On the upper bound front, we show that a $\operatorname{poly}(k,\log d)$ sample complexity for PCA is possible under DP, if we also posit sparsity of the leading eigenvector. We complement this result with $\operatorname{poly}(d)$ lower bounds under DP for both sparse covariance estimation and PCA, establishing an exponential gap between the private and non-private variants of these problems when $k=\operatorname{polylog}(d)$. To our knowledge, no such separation has previously been demonstrated for any sparse estimation problem in private high-dimensional statistics. Our techniques are flexible enough that they imply stronger lower bounds even for the well-studied problem of standard DP PCA, without sparsity assumptions.

Cite this Paper


BibTeX
@InProceedings{pmlr-v336-kumar26b, title = {On the Curse of Dimensionality in Private Sparse Covariance Estimation and {PCA}}, author = {Kumar, Syamantak and Pandey, Shourya and Sarkar, Purnamrita and Tian, Kevin}, booktitle = {Proceedings of Thirty Ninth Conference on Learning Theory}, pages = {4349--4400}, year = {2026}, editor = {Hanneke, Steve and Lattimore, Tor}, volume = {336}, series = {Proceedings of Machine Learning Research}, month = {29 Jun--03 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v336/main/assets/kumar26b/kumar26b.pdf}, url = {https://proceedings.mlr.press/v336/kumar26b.html}, abstract = {We study high-dimensional differentially private (DP) covariance estimation in the operator norm, and principal component analysis (PCA), under $k$-row-column sparsity ($k$-RCS) of the covariance matrix. In the non-private setting, it is known that $\operatorname{poly}(k,\log d)$ samples suffice to solve both of these problems. However, the only comparable result known under DP (Wang and Xu, 2021) requires $\Omega(d)$ samples under standard parameterizations of the problem. We investigate when this curse of dimensionality is inherent for sparse covariance estimation tasks under DP. On the upper bound front, we show that a $\operatorname{poly}(k,\log d)$ sample complexity for PCA is possible under DP, if we also posit sparsity of the leading eigenvector. We complement this result with $\operatorname{poly}(d)$ lower bounds under DP for both sparse covariance estimation and PCA, establishing an exponential gap between the private and non-private variants of these problems when $k=\operatorname{polylog}(d)$. To our knowledge, no such separation has previously been demonstrated for any sparse estimation problem in private high-dimensional statistics. Our techniques are flexible enough that they imply stronger lower bounds even for the well-studied problem of standard DP PCA, without sparsity assumptions.} }
Endnote
%0 Conference Paper %T On the Curse of Dimensionality in Private Sparse Covariance Estimation and PCA %A Syamantak Kumar %A Shourya Pandey %A Purnamrita Sarkar %A Kevin Tian %B Proceedings of Thirty Ninth Conference on Learning Theory %C Proceedings of Machine Learning Research %D 2026 %E Steve Hanneke %E Tor Lattimore %F pmlr-v336-kumar26b %I PMLR %P 4349--4400 %U https://proceedings.mlr.press/v336/kumar26b.html %V 336 %X We study high-dimensional differentially private (DP) covariance estimation in the operator norm, and principal component analysis (PCA), under $k$-row-column sparsity ($k$-RCS) of the covariance matrix. In the non-private setting, it is known that $\operatorname{poly}(k,\log d)$ samples suffice to solve both of these problems. However, the only comparable result known under DP (Wang and Xu, 2021) requires $\Omega(d)$ samples under standard parameterizations of the problem. We investigate when this curse of dimensionality is inherent for sparse covariance estimation tasks under DP. On the upper bound front, we show that a $\operatorname{poly}(k,\log d)$ sample complexity for PCA is possible under DP, if we also posit sparsity of the leading eigenvector. We complement this result with $\operatorname{poly}(d)$ lower bounds under DP for both sparse covariance estimation and PCA, establishing an exponential gap between the private and non-private variants of these problems when $k=\operatorname{polylog}(d)$. To our knowledge, no such separation has previously been demonstrated for any sparse estimation problem in private high-dimensional statistics. Our techniques are flexible enough that they imply stronger lower bounds even for the well-studied problem of standard DP PCA, without sparsity assumptions.
APA
Kumar, S., Pandey, S., Sarkar, P. & Tian, K.. (2026). On the Curse of Dimensionality in Private Sparse Covariance Estimation and PCA. Proceedings of Thirty Ninth Conference on Learning Theory, in Proceedings of Machine Learning Research 336:4349-4400 Available from https://proceedings.mlr.press/v336/kumar26b.html.

Related Material