Sparse PCA through Low-rank Approximations

Dimitris Papailiopoulos, Alexandros Dimakis, Stavros Korokythakis
Proceedings of the 30th International Conference on Machine Learning, PMLR 28(3):747-755, 2013.

Abstract

We introduce a novel algorithm that computes the k-sparse principal component of a positive semidefinite matrix A. Our algorithm is combinatorial and operates by examining a discrete set of special vectors lying in a low-dimensional eigen-subspace of A. We obtain provable approximation guarantees that depend on the spectral profile of the matrix: the faster the eigenvalue decay, the better the quality of our approximation. For example, if the eigenvalues of A follow a power-law decay, we obtain a polynomial-time approximation algorithm for any desired accuracy. We implement our algorithm and test it on multiple artificial and real data sets. Due to a feature elimination step, it is possible to perform sparse PCA on data sets consisting of millions of entries in a few minutes. Our experimental evaluation shows that our scheme is nearly optimal while finding very sparse vectors. We compare to the prior state of the art and show that our scheme matches or outperforms previous algorithms in all tested data sets.
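To make the low-rank idea concrete, consider the simplest case, a rank-1 approximation A_1 = lambda_1 * v1 v1^T of A: there the best k-sparse principal component of A_1 can be read off directly, by keeping the k largest-magnitude entries of the leading eigenvector v1 and renormalizing. Below is a minimal sketch of this d = 1 case in NumPy (the function name sparse_pc_rank1 is ours, not from the paper's code); the paper's full algorithm extends the search to a discrete set of candidate vectors generated from a rank-d eigen-subspace.

import numpy as np

def sparse_pc_rank1(A, k):
    """k-sparse principal component of the rank-1 approximation of PSD A.

    For A_1 = lambda_1 * v1 v1^T, maximizing x^T A_1 x over unit-norm
    k-sparse x reduces to maximizing |v1^T x|, which is achieved by the
    k largest-magnitude entries of v1, renormalized to unit norm.
    """
    eigvals, eigvecs = np.linalg.eigh(A)    # eigenvalues in ascending order
    v1 = eigvecs[:, -1]                     # leading eigenvector
    support = np.argsort(np.abs(v1))[-k:]   # k largest-magnitude coordinates
    x = np.zeros_like(v1)
    x[support] = v1[support]
    x /= np.linalg.norm(x)                  # unit-norm, k-sparse candidate
    return x

# Toy usage on a random PSD matrix:
rng = np.random.default_rng(0)
B = rng.standard_normal((50, 10))
A = B @ B.T
x = sparse_pc_rank1(A, k=5)
print(x.T @ A @ x)   # variance explained by the k-sparse vector

The returned vector is a feasible point for the original sparse PCA problem, so x^T A x lower-bounds the true k-sparse optimum; the paper's guarantees quantify how quickly this gap closes as the rank d of the approximation grows and the eigenvalues of A decay.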

Cite this Paper


BibTeX
@InProceedings{pmlr-v28-papailiopoulos13,
  title     = {Sparse PCA through Low-rank Approximations},
  author    = {Papailiopoulos, Dimitris and Dimakis, Alexandros and Korokythakis, Stavros},
  booktitle = {Proceedings of the 30th International Conference on Machine Learning},
  pages     = {747--755},
  year      = {2013},
  editor    = {Dasgupta, Sanjoy and McAllester, David},
  volume    = {28},
  number    = {3},
  series    = {Proceedings of Machine Learning Research},
  address   = {Atlanta, Georgia, USA},
  month     = {17--19 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v28/papailiopoulos13.pdf},
  url       = {https://proceedings.mlr.press/v28/papailiopoulos13.html}
}
Endnote
%0 Conference Paper
%T Sparse PCA through Low-rank Approximations
%A Dimitris Papailiopoulos
%A Alexandros Dimakis
%A Stavros Korokythakis
%B Proceedings of the 30th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2013
%E Sanjoy Dasgupta
%E David McAllester
%F pmlr-v28-papailiopoulos13
%I PMLR
%P 747--755
%U https://proceedings.mlr.press/v28/papailiopoulos13.html
%V 28
%N 3
RIS
TY - CPAPER
TI - Sparse PCA through Low-rank Approximations
AU - Dimitris Papailiopoulos
AU - Alexandros Dimakis
AU - Stavros Korokythakis
BT - Proceedings of the 30th International Conference on Machine Learning
DA - 2013/05/26
ED - Sanjoy Dasgupta
ED - David McAllester
ID - pmlr-v28-papailiopoulos13
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 28
IS - 3
SP - 747
EP - 755
L1 - http://proceedings.mlr.press/v28/papailiopoulos13.pdf
UR - https://proceedings.mlr.press/v28/papailiopoulos13.html
ER -
APA
Papailiopoulos, D., Dimakis, A., & Korokythakis, S. (2013). Sparse PCA through Low-rank Approximations. Proceedings of the 30th International Conference on Machine Learning, in Proceedings of Machine Learning Research 28(3):747-755. Available from https://proceedings.mlr.press/v28/papailiopoulos13.html.