Streaming Kernel Principal Component Analysis

Mina Ghashami; Daniel J. Perry; Jeff Phillips

Streaming Kernel Principal Component Analysis

Mina Ghashami, Daniel J. Perry, Jeff Phillips

Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, PMLR 51:1365-1374, 2016.

Abstract

Kernel principal component analysis (KPCA) provides a concise set of basis vectors which capture non-linear structures within large data sets, and is a central tool in data analysis and learning. To allow for non-linear relations, typically a full n x n kernel matrix is constructed over n data points, but this requires too much space and time for large values of n. Techniques such as the Nystrom method and random feature maps can help towards this goal, but they do not explicitly maintain the basis vectors in a stream and take more space than desired. We propose a new approach for streaming KPCA which maintains a small set of basis elements in a stream, requiring space only logarithmic in n, and also improves the dependence on the error parameter. Our technique combines together random feature maps with recent advances in matrix sketching, it has guaranteed spectral norm error bounds with respect to the original kernel matrix, and it compares favorably in practice to state-of-the-art approaches.

Cite this Paper

BibTeX


@InProceedings{pmlr-v51-ghashami16,
  title = 	 {Streaming Kernel Principal Component Analysis},
  author = 	 {Ghashami, Mina and Perry, Daniel J. and Phillips, Jeff},
  booktitle = 	 {Proceedings of the 19th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {1365--1374},
  year = 	 {2016},
  editor = 	 {Gretton, Arthur and Robert, Christian C.},
  volume = 	 {51},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Cadiz, Spain},
  month = 	 {09--11 May},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v51/ghashami16.pdf},
  url = 	 {https://proceedings.mlr.press/v51/ghashami16.html},
  abstract = 	 {Kernel principal component analysis (KPCA) provides a concise set of basis vectors which capture non-linear structures within large data sets, and is a central tool in data analysis and learning.  To allow for non-linear relations, typically a full n x n kernel matrix is constructed over n data points, but this requires too much space and time for large values of n.  Techniques such as the Nystrom method and random feature maps can help towards this goal, but they do not explicitly maintain the basis vectors in a stream and take more space than desired.  We propose a new approach for streaming KPCA which maintains a small set of basis elements in a stream, requiring space only logarithmic in n, and also improves the dependence on the error parameter.  Our technique combines together random feature maps with recent advances in matrix sketching, it has guaranteed spectral norm error bounds with respect to the original kernel matrix, and it compares favorably in practice to state-of-the-art approaches.}
}

Endnote

%0 Conference Paper
%T Streaming Kernel Principal Component Analysis
%A Mina Ghashami
%A Daniel J. Perry
%A Jeff Phillips
%B Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2016
%E Arthur Gretton
%E Christian C. Robert	
%F pmlr-v51-ghashami16
%I PMLR
%P 1365--1374
%U https://proceedings.mlr.press/v51/ghashami16.html
%V 51
%X Kernel principal component analysis (KPCA) provides a concise set of basis vectors which capture non-linear structures within large data sets, and is a central tool in data analysis and learning.  To allow for non-linear relations, typically a full n x n kernel matrix is constructed over n data points, but this requires too much space and time for large values of n.  Techniques such as the Nystrom method and random feature maps can help towards this goal, but they do not explicitly maintain the basis vectors in a stream and take more space than desired.  We propose a new approach for streaming KPCA which maintains a small set of basis elements in a stream, requiring space only logarithmic in n, and also improves the dependence on the error parameter.  Our technique combines together random feature maps with recent advances in matrix sketching, it has guaranteed spectral norm error bounds with respect to the original kernel matrix, and it compares favorably in practice to state-of-the-art approaches.

RIS


TY  - CPAPER
TI  - Streaming Kernel Principal Component Analysis
AU  - Mina Ghashami
AU  - Daniel J. Perry
AU  - Jeff Phillips
BT  - Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
DA  - 2016/05/02
ED  - Arthur Gretton
ED  - Christian C. Robert	
ID  - pmlr-v51-ghashami16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 51
SP  - 1365
EP  - 1374
L1  - http://proceedings.mlr.press/v51/ghashami16.pdf
UR  - https://proceedings.mlr.press/v51/ghashami16.html
AB  - Kernel principal component analysis (KPCA) provides a concise set of basis vectors which capture non-linear structures within large data sets, and is a central tool in data analysis and learning.  To allow for non-linear relations, typically a full n x n kernel matrix is constructed over n data points, but this requires too much space and time for large values of n.  Techniques such as the Nystrom method and random feature maps can help towards this goal, but they do not explicitly maintain the basis vectors in a stream and take more space than desired.  We propose a new approach for streaming KPCA which maintains a small set of basis elements in a stream, requiring space only logarithmic in n, and also improves the dependence on the error parameter.  Our technique combines together random feature maps with recent advances in matrix sketching, it has guaranteed spectral norm error bounds with respect to the original kernel matrix, and it compares favorably in practice to state-of-the-art approaches.
ER  -

APA


Ghashami, M., Perry, D.J. & Phillips, J.. (2016). Streaming Kernel Principal Component Analysis. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 51:1365-1374 Available from https://proceedings.mlr.press/v51/ghashami16.html.

Streaming Kernel Principal Component Analysis

Abstract

Cite this Paper

Related Material