Upper bounds for Model-Free Row-Sparse Principal Component Analysis

Guanyi Wang, Santanu Dey
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:9868-9875, 2020.

Abstract

Sparse principal component analysis (PCA) is a widely used dimensionality-reduction tool in statistics and machine learning. Most methods in the literature are either heuristics that produce good primal feasible solutions under statistical assumptions, or ADMM-type algorithms that converge to stationary/critical points of a regularized reformulation of sparse PCA. However, none of these methods can efficiently certify the quality of their solutions by comparing current objective values against dual bounds, especially in the model-free case. We propose a new framework that computes upper (dual) bounds for sparse PCA in polynomial time by solving a convex integer program (IP). We show that, in the worst case, the dual bound provided by the convex IP is within an affine function of the global optimal value. Moreover, in contrast to the semidefinite relaxation, this framework scales far more easily to large instances. Numerical results on both synthetic and real data are reported to demonstrate the advantages of our method.
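To make the primal/dual-gap idea concrete: model-free row-sparse PCA can be written as maximizing tr(V^T A V) over d-by-m matrices V with V^T V = I and at most k nonzero rows, where A is a given (sample) covariance matrix. The sketch below, our own NumPy illustration and not the paper's convex IP, pairs a simple primal heuristic with a naive dual bound obtained by dropping the sparsity constraint; the gap between the two certifies solution quality in exactly the sense the abstract describes. The support-selection rule and function names are illustrative assumptions.

import numpy as np

def primal_heuristic(A, k, m):
    # Hypothetical heuristic (not the paper's method): restrict to the k
    # coordinates with the largest column norms of A, then take the top-m
    # eigenvalues of the principal submatrix. The corresponding eigenvectors,
    # padded with zeros, form a feasible V, so this value lower-bounds OPT.
    support = np.argsort(np.linalg.norm(A, axis=0))[-k:]
    sub = A[np.ix_(support, support)]
    return float(np.sum(np.linalg.eigvalsh(sub)[-m:]))  # eigvalsh is ascending

def naive_dual_bound(A, m):
    # Drop the row-sparsity constraint entirely, leaving ordinary PCA, whose
    # optimum is the sum of the top-m eigenvalues of A. Always a valid upper
    # bound, but looser than the convex-IP bound proposed in the paper.
    return float(np.sum(np.linalg.eigvalsh(A)[-m:]))

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
A = X.T @ X / 200                # 50 x 50 sample covariance (model-free input)
k, m = 10, 3                     # row-sparsity budget k, number of components m

lb = primal_heuristic(A, k, m)
ub = naive_dual_bound(A, m)
print(f"primal {lb:.3f} <= OPT <= dual {ub:.3f}; gap = {ub - lb:.3f}")

Any tightening of the upper bound, which is what the paper's convex IP provides, shrinks this certified gap without touching the primal side.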

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-wang20e,
  title     = {Upper bounds for Model-Free Row-Sparse Principal Component Analysis},
  author    = {Wang, Guanyi and Dey, Santanu},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {9868--9875},
  year      = {2020},
  editor    = {III, Hal Daumé and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/wang20e/wang20e.pdf},
  url       = {https://proceedings.mlr.press/v119/wang20e.html}
}
APA
Wang, G., & Dey, S. (2020). Upper bounds for model-free row-sparse principal component analysis. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:9868-9875. Available from https://proceedings.mlr.press/v119/wang20e.html.
