Online PCA with Spectral Bounds

Zohar Karnin, Edo Liberty
; Proceedings of The 28th Conference on Learning Theory, PMLR 40:1129-1140, 2015.

Abstract

This paper revisits the online PCA problem. Given a stream of n vectors x_t ∈\mathbbR^d (columns of X) the algorithm must output y_t ∈\mathbbR^\ell (columns of Y) before receiving x_t+1. The goal of online PCA is to simultaneously minimize the target dimension \ell and the error \|X - (XY^\scriptstyle \textrm +)Y\|^2. We describe two simple and deterministic algorithms. The first, receives a parameter ∆and guarantees that \|X - (XY^\scriptstyle \textrm +)Y\|^2 is not significantly larger than ∆. It requires a target dimension of \ell = O(k/ε) for any k,εsuch that ∆\ge ε\sigma_1^2 + \sigma_k+1^2, with \sigma_i being the i’th singular value of X. The second receives k and εand guarantees that \|X - (XY^\scriptstyle \textrm +)Y\|^2 \le ε\sigma_1^2 + \sigma_k+1^2. It requires a target dimension of O( k\log n/ε^2). Different models and algorithms for Online PCA were considered in the past. This is the first that achieves a bound on the spectral norm of the residual matrix.

Cite this Paper


BibTeX
@InProceedings{pmlr-v40-Karnin15, title = {Online {PCA} with Spectral Bounds}, author = {Zohar Karnin and Edo Liberty}, booktitle = {Proceedings of The 28th Conference on Learning Theory}, pages = {1129--1140}, year = {2015}, editor = {Peter Grünwald and Elad Hazan and Satyen Kale}, volume = {40}, series = {Proceedings of Machine Learning Research}, address = {Paris, France}, month = {03--06 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v40/Karnin15.pdf}, url = {http://proceedings.mlr.press/v40/Karnin15.html}, abstract = {This paper revisits the online PCA problem. Given a stream of n vectors x_t ∈\mathbbR^d (columns of X) the algorithm must output y_t ∈\mathbbR^\ell (columns of Y) before receiving x_t+1. The goal of online PCA is to simultaneously minimize the target dimension \ell and the error \|X - (XY^\scriptstyle \textrm +)Y\|^2. We describe two simple and deterministic algorithms. The first, receives a parameter ∆and guarantees that \|X - (XY^\scriptstyle \textrm +)Y\|^2 is not significantly larger than ∆. It requires a target dimension of \ell = O(k/ε) for any k,εsuch that ∆\ge ε\sigma_1^2 + \sigma_k+1^2, with \sigma_i being the i’th singular value of X. The second receives k and εand guarantees that \|X - (XY^\scriptstyle \textrm +)Y\|^2 \le ε\sigma_1^2 + \sigma_k+1^2. It requires a target dimension of O( k\log n/ε^2). Different models and algorithms for Online PCA were considered in the past. This is the first that achieves a bound on the spectral norm of the residual matrix. } }
Endnote
%0 Conference Paper %T Online PCA with Spectral Bounds %A Zohar Karnin %A Edo Liberty %B Proceedings of The 28th Conference on Learning Theory %C Proceedings of Machine Learning Research %D 2015 %E Peter Grünwald %E Elad Hazan %E Satyen Kale %F pmlr-v40-Karnin15 %I PMLR %J Proceedings of Machine Learning Research %P 1129--1140 %U http://proceedings.mlr.press %V 40 %W PMLR %X This paper revisits the online PCA problem. Given a stream of n vectors x_t ∈\mathbbR^d (columns of X) the algorithm must output y_t ∈\mathbbR^\ell (columns of Y) before receiving x_t+1. The goal of online PCA is to simultaneously minimize the target dimension \ell and the error \|X - (XY^\scriptstyle \textrm +)Y\|^2. We describe two simple and deterministic algorithms. The first, receives a parameter ∆and guarantees that \|X - (XY^\scriptstyle \textrm +)Y\|^2 is not significantly larger than ∆. It requires a target dimension of \ell = O(k/ε) for any k,εsuch that ∆\ge ε\sigma_1^2 + \sigma_k+1^2, with \sigma_i being the i’th singular value of X. The second receives k and εand guarantees that \|X - (XY^\scriptstyle \textrm +)Y\|^2 \le ε\sigma_1^2 + \sigma_k+1^2. It requires a target dimension of O( k\log n/ε^2). Different models and algorithms for Online PCA were considered in the past. This is the first that achieves a bound on the spectral norm of the residual matrix.
RIS
TY - CPAPER TI - Online PCA with Spectral Bounds AU - Zohar Karnin AU - Edo Liberty BT - Proceedings of The 28th Conference on Learning Theory PY - 2015/06/26 DA - 2015/06/26 ED - Peter Grünwald ED - Elad Hazan ED - Satyen Kale ID - pmlr-v40-Karnin15 PB - PMLR SP - 1129 DP - PMLR EP - 1140 L1 - http://proceedings.mlr.press/v40/Karnin15.pdf UR - http://proceedings.mlr.press/v40/Karnin15.html AB - This paper revisits the online PCA problem. Given a stream of n vectors x_t ∈\mathbbR^d (columns of X) the algorithm must output y_t ∈\mathbbR^\ell (columns of Y) before receiving x_t+1. The goal of online PCA is to simultaneously minimize the target dimension \ell and the error \|X - (XY^\scriptstyle \textrm +)Y\|^2. We describe two simple and deterministic algorithms. The first, receives a parameter ∆and guarantees that \|X - (XY^\scriptstyle \textrm +)Y\|^2 is not significantly larger than ∆. It requires a target dimension of \ell = O(k/ε) for any k,εsuch that ∆\ge ε\sigma_1^2 + \sigma_k+1^2, with \sigma_i being the i’th singular value of X. The second receives k and εand guarantees that \|X - (XY^\scriptstyle \textrm +)Y\|^2 \le ε\sigma_1^2 + \sigma_k+1^2. It requires a target dimension of O( k\log n/ε^2). Different models and algorithms for Online PCA were considered in the past. This is the first that achieves a bound on the spectral norm of the residual matrix. ER -
APA
Karnin, Z. & Liberty, E.. (2015). Online PCA with Spectral Bounds. Proceedings of The 28th Conference on Learning Theory, in PMLR 40:1129-1140

Related Material