Stay on path: PCA along graph paths

Megasthenis Asteris, Anastasios Kyrillidis, Alex Dimakis, Han-Gyol Yi, Bharath Chandrasekaran
Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:1728-1736, 2015.

Abstract

We introduce a variant of (sparse) PCA in which the set of feasible support sets is determined by a graph. In particular, we consider the following setting: given a directed acyclic graph G on p vertices corresponding to variables, the non-zero entries of the extracted principal component must coincide with vertices lying along a path in G. From a statistical perspective, information on the underlying network may potentially reduce the number of observations required to recover the population principal component. We consider the canonical estimator which optimally exploits the prior knowledge by solving a non-convex quadratic maximization on the empirical covariance. We introduce a simple network and analyze the estimator under the spiked covariance model for sparse PCA. We show that side information potentially improves the statistical complexity. We propose two algorithms to approximate the solution of the constrained quadratic maximization, and recover a component with the desired properties. We empirically evaluate our schemes on synthetic and real datasets.
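
As a rough illustration of the constrained quadratic maximization described above, the following sketch enumerates every source-to-target path in a tiny DAG and, for each path, takes the leading eigenvector of the matching principal submatrix of the empirical covariance. This is an exhaustive search for intuition only, not either of the two approximation algorithms proposed in the paper. The toy graph, the choice to treat every vertex (including source and target) as a variable, the spike strength `beta`, the sample size, and the helper names `all_paths` and `path_pca_bruteforce` are all assumptions made for this example.

```python
import numpy as np


def all_paths(adj, source, target):
    """Yield every directed path from source to target in a DAG (iterative DFS)."""
    stack = [[source]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == target:
            yield path
            continue
        for nxt in adj.get(node, ()):
            stack.append(path + [nxt])


def path_pca_bruteforce(cov, adj, source, target):
    """Maximize x^T cov x over unit vectors supported on a source-target path.

    Exhaustive search: exponential in general, intended only for tiny graphs.
    """
    best_val, best_x, best_path = -np.inf, None, None
    for path in all_paths(adj, source, target):
        idx = np.array(path)
        # Restricting the support to `path` reduces the problem to the
        # leading eigenpair of the corresponding principal submatrix.
        vals, vecs = np.linalg.eigh(cov[np.ix_(idx, idx)])
        if vals[-1] > best_val:
            best_val = vals[-1]
            best_x = np.zeros(cov.shape[0])
            best_x[idx] = vecs[:, -1]
            best_path = path
    return best_x, best_path, best_val


# Hypothetical toy DAG: vertex 0 -> layer {1, 2} -> layer {3, 4} -> vertex 5.
adj = {0: [1, 2], 1: [3, 4], 2: [3, 4], 3: [5], 4: [5]}

# Spiked covariance: identity plus a rank-one spike supported on path 0-2-3-5.
p, beta = 6, 4.0
x_true = np.zeros(p)
x_true[[0, 2, 3, 5]] = 0.5  # unit norm: 4 * 0.25 = 1
sigma = np.eye(p) + beta * np.outer(x_true, x_true)

# Empirical covariance from n samples (seed fixed only for reproducibility).
rng = np.random.default_rng(0)
samples = rng.multivariate_normal(np.zeros(p), sigma, size=500)
emp_cov = samples.T @ samples / samples.shape[0]

x_hat, path_hat, val_hat = path_pca_bruteforce(emp_cov, adj, 0, 5)
print("estimated path:", path_hat)
print("|<x_hat, x_true>|:", abs(x_hat @ x_true))
```

Because the search only ever considers supports that form a path, the number of candidate supports here is four rather than the fifteen subsets of size four that unstructured sparse PCA would have to consider on six variables. That reduction in the feasible set is the kind of side information the abstract refers to.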

Cite this Paper


BibTeX
@InProceedings{pmlr-v37-asteris15,
  title = {Stay on path: PCA along graph paths},
  author = {Asteris, Megasthenis and Kyrillidis, Anastasios and Dimakis, Alex and Yi, Han-Gyol and Chandrasekaran, Bharath},
  booktitle = {Proceedings of the 32nd International Conference on Machine Learning},
  pages = {1728--1736},
  year = {2015},
  editor = {Bach, Francis and Blei, David},
  volume = {37},
  series = {Proceedings of Machine Learning Research},
  address = {Lille, France},
  month = {07--09 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v37/asteris15.pdf},
  url = {http://proceedings.mlr.press/v37/asteris15.html},
  abstract = {We introduce a variant of (sparse) PCA in which the set of feasible support sets is determined by a graph. In particular, we consider the following setting: given a directed acyclic graph G on p vertices corresponding to variables, the non-zero entries of the extracted principal component must coincide with vertices lying along a path in G. From a statistical perspective, information on the underlying network may potentially reduce the number of observations required to recover the population principal component. We consider the canonical estimator which optimally exploits the prior knowledge by solving a non-convex quadratic maximization on the empirical covariance. We introduce a simple network and analyze the estimator under the spiked covariance model for sparse PCA. We show that side information potentially improves the statistical complexity. We propose two algorithms to approximate the solution of the constrained quadratic maximization, and recover a component with the desired properties. We empirically evaluate our schemes on synthetic and real datasets.}
}
Endnote
%0 Conference Paper
%T Stay on path: PCA along graph paths
%A Megasthenis Asteris
%A Anastasios Kyrillidis
%A Alex Dimakis
%A Han-Gyol Yi
%A Bharath Chandrasekaran
%B Proceedings of the 32nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2015
%E Francis Bach
%E David Blei
%F pmlr-v37-asteris15
%I PMLR
%P 1728--1736
%U http://proceedings.mlr.press/v37/asteris15.html
%V 37
%X We introduce a variant of (sparse) PCA in which the set of feasible support sets is determined by a graph. In particular, we consider the following setting: given a directed acyclic graph G on p vertices corresponding to variables, the non-zero entries of the extracted principal component must coincide with vertices lying along a path in G. From a statistical perspective, information on the underlying network may potentially reduce the number of observations required to recover the population principal component. We consider the canonical estimator which optimally exploits the prior knowledge by solving a non-convex quadratic maximization on the empirical covariance. We introduce a simple network and analyze the estimator under the spiked covariance model for sparse PCA. We show that side information potentially improves the statistical complexity. We propose two algorithms to approximate the solution of the constrained quadratic maximization, and recover a component with the desired properties. We empirically evaluate our schemes on synthetic and real datasets.
RIS
TY  - CPAPER
TI  - Stay on path: PCA along graph paths
AU  - Megasthenis Asteris
AU  - Anastasios Kyrillidis
AU  - Alex Dimakis
AU  - Han-Gyol Yi
AU  - Bharath Chandrasekaran
BT  - Proceedings of the 32nd International Conference on Machine Learning
DA  - 2015/06/01
ED  - Francis Bach
ED  - David Blei
ID  - pmlr-v37-asteris15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 37
SP  - 1728
EP  - 1736
L1  - http://proceedings.mlr.press/v37/asteris15.pdf
UR  - http://proceedings.mlr.press/v37/asteris15.html
AB  - We introduce a variant of (sparse) PCA in which the set of feasible support sets is determined by a graph. In particular, we consider the following setting: given a directed acyclic graph G on p vertices corresponding to variables, the non-zero entries of the extracted principal component must coincide with vertices lying along a path in G. From a statistical perspective, information on the underlying network may potentially reduce the number of observations required to recover the population principal component. We consider the canonical estimator which optimally exploits the prior knowledge by solving a non-convex quadratic maximization on the empirical covariance. We introduce a simple network and analyze the estimator under the spiked covariance model for sparse PCA. We show that side information potentially improves the statistical complexity. We propose two algorithms to approximate the solution of the constrained quadratic maximization, and recover a component with the desired properties. We empirically evaluate our schemes on synthetic and real datasets.
ER  -
APA
Asteris, M., Kyrillidis, A., Dimakis, A., Yi, H. & Chandrasekaran, B. (2015). Stay on path: PCA along graph paths. Proceedings of the 32nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 37:1728-1736. Available from http://proceedings.mlr.press/v37/asteris15.html.
