Fast algorithms for sparse principal component analysis based on Rayleigh quotient iteration


Volodymyr Kuleshov ;
Proceedings of the 30th International Conference on Machine Learning, PMLR 28(3):1418-1425, 2013.


We introduce new algorithms for sparse principal component analysis (sPCA), a variation of PCA which aims to represent data in a sparse low-dimensional basis. Our algorithms possess a cubic rate of convergence and can compute principal components with k non-zero elements at a cost of O(nk + k^3) flops per iteration. We observe in numerical experiments that these components are of equal or greater quality than ones obtained from current state-of-the-art techniques, but require between one and two orders of magnitude fewer flops to be computed. Conceptually, our approach generalizes the Rayleigh quotient iteration algorithm for computing eigenvectors, and can be interpreted as a type of second-order optimization method. We demonstrate the applicability of our algorithms on several datasets, including the STL-10 machine vision dataset and gene expression data.

Related Material