Streaming k-PCA: Efficient guarantees for Oja’s algorithm, beyond rank-one updates

De Huang, Jonathan Niles-Weed, Rachel Ward
Proceedings of Thirty Fourth Conference on Learning Theory, PMLR 134:2463-2498, 2021.

Abstract

We analyze Oja’s algorithm for streaming $k$-PCA, and prove that it achieves performance nearly matching that of an optimal offline algorithm. Given access to a sequence of i.i.d. $d \times d$ symmetric matrices, we show that Oja’s algorithm can obtain an accurate approximation to the subspace of the top $k$ eigenvectors of their expectation using a number of samples that scales polylogarithmically with $d$. Previously, such a result was only known in the case where the updates have rank one. Our analysis is based on recently developed matrix concentration tools, which allow us to prove strong bounds on the tails of the random matrices which arise in the course of the algorithm’s execution.
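The update rule under study may be easier to parse with a concrete sketch in hand. Below is a minimal NumPy illustration of Oja's algorithm with a rank-k iterate, in the spirit of the setting described above; the function name oja_kpca, the constant step size eta, the per-step QR re-orthonormalization, and the synthetic Gaussian demo are all illustrative assumptions, not the paper's exact procedure or experiments.

import numpy as np

def oja_kpca(matrix_stream, d, k, eta):
    """Sketch of Oja's algorithm for streaming k-PCA.

    matrix_stream yields i.i.d. symmetric d x d matrices A_t whose
    expectation Sigma has a top-k eigenspace we want to approximate.
    A constant step size eta and QR re-orthonormalization at every
    step are simplifying choices for this illustration.
    """
    rng = np.random.default_rng(0)
    # Random orthonormal initialization of the d x k iterate.
    Q, _ = np.linalg.qr(rng.standard_normal((d, k)))
    for A in matrix_stream:
        # Power-iteration-style update, then re-orthonormalize.
        Q, _ = np.linalg.qr(Q + eta * (A @ Q))
    return Q  # columns approximately span the top-k eigenspace of Sigma

# Demo on a synthetic rank-one stream A_t = x_t x_t^T with E[A_t] = Sigma;
# higher-rank symmetric updates plug into oja_kpca the same way.
d, k, n = 50, 3, 20000
rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.standard_normal((d, d)))
eigvals = np.concatenate([[5.0, 4.0, 3.0], np.ones(d - k)])
Sigma = (U * eigvals) @ U.T          # top-k eigenspace is span(U[:, :k])
L = np.linalg.cholesky(Sigma)

stream = (np.outer(x, x) for x in (L @ rng.standard_normal((d, n))).T)
Q = oja_kpca(stream, d, k, eta=0.005)

# Residual of Q outside the true top-k eigenspace (0 = perfect recovery).
P = U[:, :k] @ U[:, :k].T
print("subspace residual:", np.linalg.norm(Q - P @ Q))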

Cite this Paper

BibTeX
@InProceedings{pmlr-v134-huang21a,
  title     = {Streaming k-PCA: Efficient guarantees for Oja’s algorithm, beyond rank-one updates},
  author    = {Huang, De and Niles-Weed, Jonathan and Ward, Rachel},
  booktitle = {Proceedings of Thirty Fourth Conference on Learning Theory},
  pages     = {2463--2498},
  year      = {2021},
  editor    = {Belkin, Mikhail and Kpotufe, Samory},
  volume    = {134},
  series    = {Proceedings of Machine Learning Research},
  month     = {15--19 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v134/huang21a/huang21a.pdf},
  url       = {https://proceedings.mlr.press/v134/huang21a.html}
}
Endnote
%0 Conference Paper
%T Streaming k-PCA: Efficient guarantees for Oja’s algorithm, beyond rank-one updates
%A De Huang
%A Jonathan Niles-Weed
%A Rachel Ward
%B Proceedings of Thirty Fourth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2021
%E Mikhail Belkin
%E Samory Kpotufe
%F pmlr-v134-huang21a
%I PMLR
%P 2463--2498
%U https://proceedings.mlr.press/v134/huang21a.html
%V 134
APA
Huang, D., Niles-Weed, J., & Ward, R. (2021). Streaming k-PCA: Efficient guarantees for Oja’s algorithm, beyond rank-one updates. Proceedings of Thirty Fourth Conference on Learning Theory, in Proceedings of Machine Learning Research 134:2463-2498. Available from https://proceedings.mlr.press/v134/huang21a.html.
