Beyond Sin-Squared Error: Linear Time Entrywise Uncertainty Quantification for Streaming PCA

Syamantak Kumar, Shourya Pandey, Purnamrita Sarkar
Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, PMLR 286:2396-2430, 2025.

Abstract

We propose a novel statistical inference framework for streaming principal component analysis (PCA) using Oja’s algorithm, enabling the construction of confidence intervals for individual entries of the estimated eigenvector. Most existing works on streaming PCA focus on providing sharp sin-squared error guarantees. Recently, there has been some interest in uncertainty quantification for the sin-squared error. However, uncertainty quantification or sharp error guarantees for _entries of the estimated eigenvector_ in the streaming setting remains largely unexplored. We derive a sharp Bernstein-type concentration bound for elements of the estimated vector matching the optimal error rate up to logarithmic factors. We also establish a Central Limit Theorem for a suitably centered and scaled subset of the entries. To efficiently estimate the coordinate-wise variance, we introduce a provably consistent subsampling algorithm that leverages the median-of-means approach, empirically achieving similar accuracy to multiplier bootstrap methods while being significantly more computationally efficient. Numerical experiments demonstrate its effectiveness in providing reliable uncertainty estimates with a fraction of the computational cost of existing methods.

Cite this Paper


BibTeX
@InProceedings{pmlr-v286-kumar25a, title = {Beyond Sin-Squared Error: Linear Time Entrywise Uncertainty Quantification for Streaming PCA}, author = {Kumar, Syamantak and Pandey, Shourya and Sarkar, Purnamrita}, booktitle = {Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence}, pages = {2396--2430}, year = {2025}, editor = {Chiappa, Silvia and Magliacane, Sara}, volume = {286}, series = {Proceedings of Machine Learning Research}, month = {21--25 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v286/main/assets/kumar25a/kumar25a.pdf}, url = {https://proceedings.mlr.press/v286/kumar25a.html}, abstract = {We propose a novel statistical inference framework for streaming principal component analysis (PCA) using Oja’s algorithm, enabling the construction of confidence intervals for individual entries of the estimated eigenvector. Most existing works on streaming PCA focus on providing sharp sin-squared error guarantees. Recently, there has been some interest in uncertainty quantification for the sin-squared error. However, uncertainty quantification or sharp error guarantees for _entries of the estimated eigenvector_ in the streaming setting remains largely unexplored. We derive a sharp Bernstein-type concentration bound for elements of the estimated vector matching the optimal error rate up to logarithmic factors. We also establish a Central Limit Theorem for a suitably centered and scaled subset of the entries. To efficiently estimate the coordinate-wise variance, we introduce a provably consistent subsampling algorithm that leverages the median-of-means approach, empirically achieving similar accuracy to multiplier bootstrap methods while being significantly more computationally efficient. Numerical experiments demonstrate its effectiveness in providing reliable uncertainty estimates with a fraction of the computational cost of existing methods.} }
Endnote
%0 Conference Paper %T Beyond Sin-Squared Error: Linear Time Entrywise Uncertainty Quantification for Streaming PCA %A Syamantak Kumar %A Shourya Pandey %A Purnamrita Sarkar %B Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence %C Proceedings of Machine Learning Research %D 2025 %E Silvia Chiappa %E Sara Magliacane %F pmlr-v286-kumar25a %I PMLR %P 2396--2430 %U https://proceedings.mlr.press/v286/kumar25a.html %V 286 %X We propose a novel statistical inference framework for streaming principal component analysis (PCA) using Oja’s algorithm, enabling the construction of confidence intervals for individual entries of the estimated eigenvector. Most existing works on streaming PCA focus on providing sharp sin-squared error guarantees. Recently, there has been some interest in uncertainty quantification for the sin-squared error. However, uncertainty quantification or sharp error guarantees for _entries of the estimated eigenvector_ in the streaming setting remains largely unexplored. We derive a sharp Bernstein-type concentration bound for elements of the estimated vector matching the optimal error rate up to logarithmic factors. We also establish a Central Limit Theorem for a suitably centered and scaled subset of the entries. To efficiently estimate the coordinate-wise variance, we introduce a provably consistent subsampling algorithm that leverages the median-of-means approach, empirically achieving similar accuracy to multiplier bootstrap methods while being significantly more computationally efficient. Numerical experiments demonstrate its effectiveness in providing reliable uncertainty estimates with a fraction of the computational cost of existing methods.
APA
Kumar, S., Pandey, S. & Sarkar, P.. (2025). Beyond Sin-Squared Error: Linear Time Entrywise Uncertainty Quantification for Streaming PCA. Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 286:2396-2430 Available from https://proceedings.mlr.press/v286/kumar25a.html.

Related Material