A Dimension-Independent Generalization Bound for Kernel Supervised Principal Component Analysis

Hassan Ashtiani, Ali Ghodsi
Proceedings of the 1st International Workshop on Feature Extraction: Modern Questions and Challenges at NIPS 2015, PMLR 44:19-29, 2015.

Abstract

Kernel supervised principal component analysis (KSPCA) is a computationally efficient supervised feature extraction method that can learn non-linear transformations. We initiate the study of the statistical properties of KSPCA, providing the first bound on its sample complexity. This bound is dimension-independent, which justifies the good performance of KSPCA on high-dimensional data. Another observation is that in the kernelized version, the number of parameters of KSPCA grows linearly with the sample size. While this potentially increases the risk of over-fitting, KSPCA works well in practice. In this work, we justify this compelling characteristic of KSPCA by providing a guarantee indicating that KSPCA generalizes well even when the number of parameters is large, as long as their norms are small.
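
For readers who want a concrete picture of the method being analyzed, below is a minimal sketch of how a KSPCA-style transformation is commonly computed: the dual coefficients are obtained from a generalized eigenproblem involving the input kernel matrix and a centered label kernel matrix. The kernel choices (RBF on inputs, delta kernel on class labels), the ridge term `reg`, and the function names `rbf_kernel` and `kspca_fit` are illustrative assumptions for this sketch, not the paper's prescribed setup.

```python
import numpy as np
from scipy.linalg import eigh

def rbf_kernel(A, B, gamma=1.0):
    # Gram matrix of the Gaussian (RBF) kernel between rows of A and rows of B.
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * d2)

def kspca_fit(X, y, n_components=2, gamma=1.0, reg=1e-8):
    # Illustrative KSPCA-style fit: choose projection directions that maximize
    # an HSIC-like dependence between projected inputs and labels, expressed
    # entirely through kernel matrices.
    n = X.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    Kx = rbf_kernel(X, X, gamma)               # kernel over inputs
    Ky = np.equal.outer(y, y).astype(float)    # delta kernel over class labels
    Q = Kx @ H @ Ky @ H @ Kx                   # symmetric objective matrix
    # Generalized eigenproblem Q b = lambda * Kx b; 'reg' keeps Kx well-conditioned.
    vals, B = eigh(Q, Kx + reg * np.eye(n))
    top = np.argsort(vals)[::-1][:n_components]
    return B[:, top]                           # dual coefficients, one column per direction

# Projecting data with the fitted coefficients:
#   Z_train = Kx @ B
#   Z_test  = rbf_kernel(X_test, X_train, gamma) @ B
```

Note that B has one row per training point, which is the sense in which the number of parameters grows linearly with the sample size. The paper's guarantee is of the norm-based flavor familiar from kernel methods: generalization is controlled by the norms of the learned directions rather than by the ambient dimension or the raw parameter count.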

Cite this Paper


BibTeX
@InProceedings{pmlr-v44-Ashtiani2015,
  title     = {A Dimension-Independent Generalization Bound for Kernel Supervised Principal Component Analysis},
  author    = {Ashtiani, Hassan and Ghodsi, Ali},
  booktitle = {Proceedings of the 1st International Workshop on Feature Extraction: Modern Questions and Challenges at NIPS 2015},
  pages     = {19--29},
  year      = {2015},
  editor    = {Storcheus, Dmitry and Rostamizadeh, Afshin and Kumar, Sanjiv},
  volume    = {44},
  series    = {Proceedings of Machine Learning Research},
  address   = {Montreal, Canada},
  month     = {11 Dec},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v44/Ashtiani2015.pdf},
  url       = {https://proceedings.mlr.press/v44/Ashtiani2015.html},
  abstract  = {Kernel supervised principal component analysis (KSPCA) is a computationally efficient supervised feature extraction method that can learn non-linear transformations. We start the study of the statistical properties of KSPCA, providing the first bound on its sample complexity. This bound is dimension-independent, which justifies the good performance of KSPCA on high-dimensional data. Another observation is that in the kernelized version, the number of parameters of KSPCA grows linearly with the sample size. While this potentially increases the risk of over-fitting, KSPCA works well in practice. In this work, we justify this compelling characteristic of KSPCA by providing a guarantee indicating that KSPCA generalizes well even when the number of parameters is large, as long as they have small norms.}
}
APA
Ashtiani, H., & Ghodsi, A. (2015). A Dimension-Independent Generalization Bound for Kernel Supervised Principal Component Analysis. Proceedings of the 1st International Workshop on Feature Extraction: Modern Questions and Challenges at NIPS 2015, in Proceedings of Machine Learning Research 44:19-29. Available from https://proceedings.mlr.press/v44/Ashtiani2015.html.
