Online PCA with Spectral Bounds


Zohar Karnin, Edo Liberty ;
Proceedings of The 28th Conference on Learning Theory, PMLR 40:1129-1140, 2015.


This paper revisits the online PCA problem. Given a stream of n vectors x_t ∈\mathbbR^d (columns of X) the algorithm must output y_t ∈\mathbbR^\ell (columns of Y) before receiving x_t+1. The goal of online PCA is to simultaneously minimize the target dimension \ell and the error \|X - (XY^\scriptstyle \textrm +)Y\|^2. We describe two simple and deterministic algorithms. The first, receives a parameter ∆and guarantees that \|X - (XY^\scriptstyle \textrm +)Y\|^2 is not significantly larger than ∆. It requires a target dimension of \ell = O(k/ε) for any k,εsuch that ∆\ge ε\sigma_1^2 + \sigma_k+1^2, with \sigma_i being the i’th singular value of X. The second receives k and εand guarantees that \|X - (XY^\scriptstyle \textrm +)Y\|^2 \le ε\sigma_1^2 + \sigma_k+1^2. It requires a target dimension of O( k\log n/ε^2). Different models and algorithms for Online PCA were considered in the past. This is the first that achieves a bound on the spectral norm of the residual matrix.

Related Material