Memory and Computation Efficient PCA via Very Sparse Random Projections

Farhad Pourkamali Anaraki; Shannon Hughes

Memory and Computation Efficient PCA via Very Sparse Random Projections

Farhad Pourkamali Anaraki, Shannon Hughes

Proceedings of the 31st International Conference on Machine Learning, PMLR 32(2):1341-1349, 2014.

Abstract

Algorithms that can efficiently recover principal components in very high-dimensional, streaming, and/or distributed data settings have become an important topic in the literature. In this paper, we propose an approach to principal component estimation that utilizes projections onto very sparse random vectors with Bernoulli-generated nonzero entries. Indeed, our approach is simultaneously efficient in memory/storage space, efficient in computation, and produces accurate PC estimates, while also allowing for rigorous theoretical performance analysis. Moreover, one can tune the sparsity of the random vectors deliberately to achieve a desired point on the tradeoffs between memory, computation, and accuracy. We rigorously characterize these tradeoffs and provide statistical performance guarantees. In addition to these very sparse random vectors, our analysis also applies to more general random projections. We present experimental results demonstrating that this approach allows for simultaneously achieving a substantial reduction of the computational complexity and memory/storage space, with little loss in accuracy, particularly for very high-dimensional data.

Cite this Paper

BibTeX


@InProceedings{pmlr-v32-anaraki14,
  title = 	 {Memory and Computation Efficient PCA via Very Sparse Random Projections},
  author = 	 {Anaraki, Farhad Pourkamali and Hughes, Shannon},
  booktitle = 	 {Proceedings of the 31st International Conference on Machine Learning},
  pages = 	 {1341--1349},
  year = 	 {2014},
  editor = 	 {Xing, Eric P. and Jebara, Tony},
  volume = 	 {32},
  number =       {2},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Bejing, China},
  month = 	 {22--24 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v32/anaraki14.pdf},
  url = 	 {https://proceedings.mlr.press/v32/anaraki14.html},
  abstract = 	 {Algorithms that can efficiently recover principal components in very high-dimensional, streaming, and/or distributed data settings have become an important topic in the literature. In this paper, we propose an approach to principal component estimation that utilizes projections onto very sparse random vectors with Bernoulli-generated nonzero entries. Indeed, our approach is simultaneously efficient in memory/storage space, efficient in computation, and produces accurate PC estimates, while also allowing for rigorous theoretical performance analysis. Moreover, one can tune the sparsity of the random vectors deliberately to achieve a desired point on the tradeoffs between memory, computation, and accuracy. We rigorously characterize these tradeoffs and provide statistical performance guarantees. In addition to these very sparse random vectors, our analysis also applies to more general random projections. We present experimental results demonstrating that this approach allows for simultaneously achieving a substantial reduction of the computational complexity and memory/storage space, with little loss in accuracy, particularly for very high-dimensional data.}
}

Endnote

%0 Conference Paper
%T Memory and Computation Efficient PCA via Very Sparse Random Projections
%A Farhad Pourkamali Anaraki
%A Shannon Hughes
%B Proceedings of the 31st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2014
%E Eric P. Xing
%E Tony Jebara	
%F pmlr-v32-anaraki14
%I PMLR
%P 1341--1349
%U https://proceedings.mlr.press/v32/anaraki14.html
%V 32
%N 2
%X Algorithms that can efficiently recover principal components in very high-dimensional, streaming, and/or distributed data settings have become an important topic in the literature. In this paper, we propose an approach to principal component estimation that utilizes projections onto very sparse random vectors with Bernoulli-generated nonzero entries. Indeed, our approach is simultaneously efficient in memory/storage space, efficient in computation, and produces accurate PC estimates, while also allowing for rigorous theoretical performance analysis. Moreover, one can tune the sparsity of the random vectors deliberately to achieve a desired point on the tradeoffs between memory, computation, and accuracy. We rigorously characterize these tradeoffs and provide statistical performance guarantees. In addition to these very sparse random vectors, our analysis also applies to more general random projections. We present experimental results demonstrating that this approach allows for simultaneously achieving a substantial reduction of the computational complexity and memory/storage space, with little loss in accuracy, particularly for very high-dimensional data.

RIS


TY  - CPAPER
TI  - Memory and Computation Efficient PCA via Very Sparse Random Projections
AU  - Farhad Pourkamali Anaraki
AU  - Shannon Hughes
BT  - Proceedings of the 31st International Conference on Machine Learning
DA  - 2014/06/18
ED  - Eric P. Xing
ED  - Tony Jebara	
ID  - pmlr-v32-anaraki14
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 32
IS  - 2
SP  - 1341
EP  - 1349
L1  - http://proceedings.mlr.press/v32/anaraki14.pdf
UR  - https://proceedings.mlr.press/v32/anaraki14.html
AB  - Algorithms that can efficiently recover principal components in very high-dimensional, streaming, and/or distributed data settings have become an important topic in the literature. In this paper, we propose an approach to principal component estimation that utilizes projections onto very sparse random vectors with Bernoulli-generated nonzero entries. Indeed, our approach is simultaneously efficient in memory/storage space, efficient in computation, and produces accurate PC estimates, while also allowing for rigorous theoretical performance analysis. Moreover, one can tune the sparsity of the random vectors deliberately to achieve a desired point on the tradeoffs between memory, computation, and accuracy. We rigorously characterize these tradeoffs and provide statistical performance guarantees. In addition to these very sparse random vectors, our analysis also applies to more general random projections. We present experimental results demonstrating that this approach allows for simultaneously achieving a substantial reduction of the computational complexity and memory/storage space, with little loss in accuracy, particularly for very high-dimensional data.
ER  -

APA


Anaraki, F.P. & Hughes, S.. (2014). Memory and Computation Efficient PCA via Very Sparse Random Projections. Proceedings of the 31st International Conference on Machine Learning, in Proceedings of Machine Learning Research 32(2):1341-1349 Available from https://proceedings.mlr.press/v32/anaraki14.html.

Memory and Computation Efficient PCA via Very Sparse Random Projections

Abstract

Cite this Paper

Related Material