Sparse Probabilistic Principal Component Analysis

Yue Guan, Jennifer Dy
Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, PMLR 5:185-192, 2009.

Abstract

Principal component analysis (PCA) is a popular dimensionality reduction algorithm. However, it is difficult to interpret which of the original features are important from the principal components alone. Recent methods improve interpretability by sparsifying PCA through the addition of an $L_1$ regularizer. In this paper, we introduce a probabilistic formulation for sparse PCA. By casting sparse PCA as a probabilistic Bayesian formulation, we gain the benefit of automatic model selection. We examine three different priors for achieving sparsification: (1) a two-level hierarchical prior equivalent to a Laplacian distribution and consequently to $L_1$ regularization, (2) an inverse-Gaussian prior, and (3) a Jeffreys prior. We learn these models by applying variational inference. Our experiments verify that our sparse probabilistic model indeed yields a sparse PCA solution.
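
The first of the three priors rests on a standard Gaussian scale-mixture identity: a zero-mean Gaussian whose variance is exponentially distributed is marginally Laplacian, and the Laplacian log-density is exactly an $L_1$ penalty on the loadings. Below is a minimal generative sketch in Python/NumPy of this idea applied to the PPCA likelihood; the variable names, dimensions, and rate parameter are illustrative assumptions, not values from the paper, and the variational inference procedure used to fit the model is not shown.

import numpy as np

rng = np.random.default_rng(0)

d, q, n = 10, 3, 1000   # observed dim, latent dim, number of samples
lam = 5.0               # Laplace rate (assumed); larger pushes loadings to zero
sigma2 = 0.1            # isotropic observation-noise variance (assumed)

# Two-level hierarchical prior: tau_ij ~ Exponential(rate = lam^2 / 2),
# W_ij | tau_ij ~ N(0, tau_ij). Marginalizing tau gives
# p(W_ij) = (lam / 2) * exp(-lam * |W_ij|), a Laplacian, whose negative
# log-density is the $L_1$ penalty mentioned in the abstract.
tau = rng.exponential(scale=2.0 / lam**2, size=(d, q))   # scale = 1 / rate
W = rng.normal(loc=0.0, scale=np.sqrt(tau))              # sparse-ish loadings

# PPCA likelihood: x_i = W z_i + noise, with standard-normal latents z_i.
Z = rng.normal(size=(n, q))
X = Z @ W.T + np.sqrt(sigma2) * rng.normal(size=(n, d))

# Sanity check of the mixture identity: under Laplace(rate=lam), E|W_ij| = 1/lam.
print("empirical mean |W_ij|:", np.abs(W).mean(), " theory:", 1.0 / lam)

The inverse-Gaussian and Jeffreys priors examined in the paper would swap in a different mixing density for tau; the rest of the generative story stays the same.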

Cite this Paper

BibTeX
@InProceedings{pmlr-v5-guan09a,
  title     = {Sparse Probabilistic Principal Component Analysis},
  author    = {Guan, Yue and Dy, Jennifer},
  booktitle = {Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics},
  pages     = {185--192},
  year      = {2009},
  editor    = {van Dyk, David and Welling, Max},
  volume    = {5},
  series    = {Proceedings of Machine Learning Research},
  address   = {Hilton Clearwater Beach Resort, Clearwater Beach, Florida USA},
  month     = {16--18 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v5/guan09a/guan09a.pdf},
  url       = {https://proceedings.mlr.press/v5/guan09a.html}
}
APA
Guan, Y. & Dy, J. (2009). Sparse Probabilistic Principal Component Analysis. Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 5:185-192. Available from https://proceedings.mlr.press/v5/guan09a.html.

Related Material

Download PDF: http://proceedings.mlr.press/v5/guan09a/guan09a.pdf