Fast algorithms for sparse principal component analysis based on Rayleigh quotient iteration

Volodymyr Kuleshov
; Proceedings of the 30th International Conference on Machine Learning, PMLR 28(3):1418-1425, 2013.

Abstract

We introduce new algorithms for sparse principal component analysis (sPCA), a variation of PCA which aims to represent data in a sparse low-dimensional basis. Our algorithms possess a cubic rate of convergence and can compute principal components with k non-zero elements at a cost of O(nk + k^3) flops per iteration. We observe in numerical experiments that these components are of equal or greater quality than ones obtained from current state-of-the-art techniques, but require between one and two orders of magnitude fewer flops to be computed. Conceptually, our approach generalizes the Rayleigh quotient iteration algorithm for computing eigenvectors, and can be interpreted as a type of second-order optimization method. We demonstrate the applicability of our algorithms on several datasets, including the STL-10 machine vision dataset and gene expression data.

Cite this Paper


BibTeX
@InProceedings{pmlr-v28-kuleshov13, title = {Fast algorithms for sparse principal component analysis based on Rayleigh quotient iteration}, author = {Volodymyr Kuleshov}, booktitle = {Proceedings of the 30th International Conference on Machine Learning}, pages = {1418--1425}, year = {2013}, editor = {Sanjoy Dasgupta and David McAllester}, volume = {28}, number = {3}, series = {Proceedings of Machine Learning Research}, address = {Atlanta, Georgia, USA}, month = {17--19 Jun}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v28/kuleshov13.pdf}, url = {http://proceedings.mlr.press/v28/kuleshov13.html}, abstract = {We introduce new algorithms for sparse principal component analysis (sPCA), a variation of PCA which aims to represent data in a sparse low-dimensional basis. Our algorithms possess a cubic rate of convergence and can compute principal components with k non-zero elements at a cost of O(nk + k^3) flops per iteration. We observe in numerical experiments that these components are of equal or greater quality than ones obtained from current state-of-the-art techniques, but require between one and two orders of magnitude fewer flops to be computed. Conceptually, our approach generalizes the Rayleigh quotient iteration algorithm for computing eigenvectors, and can be interpreted as a type of second-order optimization method. We demonstrate the applicability of our algorithms on several datasets, including the STL-10 machine vision dataset and gene expression data.} }
Endnote
%0 Conference Paper %T Fast algorithms for sparse principal component analysis based on Rayleigh quotient iteration %A Volodymyr Kuleshov %B Proceedings of the 30th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2013 %E Sanjoy Dasgupta %E David McAllester %F pmlr-v28-kuleshov13 %I PMLR %J Proceedings of Machine Learning Research %P 1418--1425 %U http://proceedings.mlr.press %V 28 %N 3 %W PMLR %X We introduce new algorithms for sparse principal component analysis (sPCA), a variation of PCA which aims to represent data in a sparse low-dimensional basis. Our algorithms possess a cubic rate of convergence and can compute principal components with k non-zero elements at a cost of O(nk + k^3) flops per iteration. We observe in numerical experiments that these components are of equal or greater quality than ones obtained from current state-of-the-art techniques, but require between one and two orders of magnitude fewer flops to be computed. Conceptually, our approach generalizes the Rayleigh quotient iteration algorithm for computing eigenvectors, and can be interpreted as a type of second-order optimization method. We demonstrate the applicability of our algorithms on several datasets, including the STL-10 machine vision dataset and gene expression data.
RIS
TY - CPAPER TI - Fast algorithms for sparse principal component analysis based on Rayleigh quotient iteration AU - Volodymyr Kuleshov BT - Proceedings of the 30th International Conference on Machine Learning PY - 2013/02/13 DA - 2013/02/13 ED - Sanjoy Dasgupta ED - David McAllester ID - pmlr-v28-kuleshov13 PB - PMLR SP - 1418 DP - PMLR EP - 1425 L1 - http://proceedings.mlr.press/v28/kuleshov13.pdf UR - http://proceedings.mlr.press/v28/kuleshov13.html AB - We introduce new algorithms for sparse principal component analysis (sPCA), a variation of PCA which aims to represent data in a sparse low-dimensional basis. Our algorithms possess a cubic rate of convergence and can compute principal components with k non-zero elements at a cost of O(nk + k^3) flops per iteration. We observe in numerical experiments that these components are of equal or greater quality than ones obtained from current state-of-the-art techniques, but require between one and two orders of magnitude fewer flops to be computed. Conceptually, our approach generalizes the Rayleigh quotient iteration algorithm for computing eigenvectors, and can be interpreted as a type of second-order optimization method. We demonstrate the applicability of our algorithms on several datasets, including the STL-10 machine vision dataset and gene expression data. ER -
APA
Kuleshov, V.. (2013). Fast algorithms for sparse principal component analysis based on Rayleigh quotient iteration. Proceedings of the 30th International Conference on Machine Learning, in PMLR 28(3):1418-1425

Related Material