A Dimension-Independent Generalization Bound for Kernel Supervised Principal Component Analysis

Hassan Ashtiani; Ali Ghodsi

Proceedings of the 1st International Workshop on Feature Extraction: Modern Questions and Challenges at NIPS 2015, PMLR 44:19-29, 2015.

Abstract

Kernel supervised principal component analysis (KSPCA) is a computationally efficient supervised feature extraction method that can learn non-linear transformations. We start the study of the statistical properties of KSPCA, providing the first bound on its sample complexity. This bound is dimension-independent, which justifies the good performance of KSPCA on high-dimensional data. Another observation is that in the kernelized version, the number of parameters of KSPCA grows linearly with the sample size. While this potentially increases the risk of over-fitting, KSPCA works well in practice. In this work, we justify this compelling characteristic of KSPCA by providing a guarantee indicating that KSPCA generalizes well even when the number of parameters is large, as long as they have small norms.

Cite this Paper

BibTeX


@InProceedings{pmlr-v44-Ashtiani2015,
  title = 	 {A Dimension-Independent Generalization Bound for Kernel Supervised Principal Component Analysis},
  author = 	 {Ashtiani, Hassan and Ghodsi, Ali},
  booktitle = 	 {Proceedings of the 1st International Workshop on Feature Extraction: Modern Questions and Challenges at NIPS 2015},
  pages = 	 {19--29},
  year = 	 {2015},
  editor = 	 {Storcheus, Dmitry and Rostamizadeh, Afshin and Kumar, Sanjiv},
  volume = 	 {44},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Montreal, Canada},
  month = 	 {11 Dec},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v44/Ashtiani2015.pdf},
  url = 	 {https://proceedings.mlr.press/v44/Ashtiani2015.html},
  abstract = 	 {Kernel supervised principal component analysis (KSPCA) is a computationally efficient supervised feature extraction method that can learn non-linear transformations. We start the study of the statistical properties of KSPCA, providing the first bound on its sample complexity. This bound is dimension-independent, which justifies the good performance of KSPCA on high-dimensional data. Another observation is that in the kernelized version, the number of parameters of KSPCA grows linearly with the sample size. While this potentially increases the risk of over-fitting, KSPCA works well in practice. In this work, we justify this compelling characteristic of KSPCA by providing a guarantee indicating that KSPCA generalizes well even when the number of parameters is large, as long as they have small norms.}
}

Endnote

%0 Conference Paper
%T A Dimension-Independent Generalization Bound for Kernel Supervised Principal Component Analysis
%A Hassan Ashtiani
%A Ali Ghodsi
%B Proceedings of the 1st International Workshop on Feature Extraction: Modern Questions and Challenges at NIPS 2015
%C Proceedings of Machine Learning Research
%D 2015
%E Dmitry Storcheus
%E Afshin Rostamizadeh
%E Sanjiv Kumar
%F pmlr-v44-Ashtiani2015
%I PMLR
%P 19--29
%U https://proceedings.mlr.press/v44/Ashtiani2015.html
%V 44
%X Kernel supervised principal component analysis (KSPCA) is a computationally efficient supervised feature extraction method that can learn non-linear transformations. We start the study of the statistical properties of KSPCA, providing the first bound on its sample complexity. This bound is dimension-independent, which justifies the good performance of KSPCA on high-dimensional data. Another observation is that in the kernelized version, the number of parameters of KSPCA grows linearly with the sample size. While this potentially increases the risk of over-fitting, KSPCA works well in practice. In this work, we justify this compelling characteristic of KSPCA by providing a guarantee indicating that KSPCA generalizes well even when the number of parameters is large, as long as they have small norms.

RIS


TY  - CPAPER
TI  - A Dimension-Independent Generalization Bound for Kernel Supervised Principal Component Analysis
AU  - Hassan Ashtiani
AU  - Ali Ghodsi
BT  - Proceedings of the 1st International Workshop on Feature Extraction: Modern Questions and Challenges at NIPS 2015
DA  - 2015/12/08
ED  - Dmitry Storcheus
ED  - Afshin Rostamizadeh
ED  - Sanjiv Kumar
ID  - pmlr-v44-Ashtiani2015
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 44
SP  - 19
EP  - 29
L1  - http://proceedings.mlr.press/v44/Ashtiani2015.pdf
UR  - https://proceedings.mlr.press/v44/Ashtiani2015.html
AB  - Kernel supervised principal component analysis (KSPCA) is a computationally efficient supervised feature extraction method that can learn non-linear transformations. We start the study of the statistical properties of KSPCA, providing the first bound on its sample complexity. This bound is dimension-independent, which justifies the good performance of KSPCA on high-dimensional data. Another observation is that in the kernelized version, the number of parameters of KSPCA grows linearly with the sample size. While this potentially increases the risk of over-fitting, KSPCA works well in practice. In this work, we justify this compelling characteristic of KSPCA by providing a guarantee indicating that KSPCA generalizes well even when the number of parameters is large, as long as they have small norms.
ER  -

APA


Ashtiani, H. & Ghodsi, A. (2015). A Dimension-Independent Generalization Bound for Kernel Supervised Principal Component Analysis. Proceedings of the 1st International Workshop on Feature Extraction: Modern Questions and Challenges at NIPS 2015, in Proceedings of Machine Learning Research 44:19-29. Available from https://proceedings.mlr.press/v44/Ashtiani2015.html.
