Efficient Dimensionality Reduction for High-Dimensional Network Estimation

Safiye Celik; Benjamin Logsdon; Su-In Lee

Proceedings of the 31st International Conference on Machine Learning, PMLR 32(2):1953-1961, 2014.

Abstract

We propose module graphical lasso (MGL), an aggressive dimensionality reduction and network estimation technique for a high-dimensional Gaussian graphical model (GGM). MGL achieves scalability, interpretability and robustness by exploiting the modularity property of many real-world networks. Variables are organized into tightly coupled modules and a graph structure is estimated to determine the conditional independencies among modules. MGL iteratively learns the module assignment of variables, the latent variables, each corresponding to a module, and the parameters of the GGM of the latent variables. In synthetic data experiments, MGL outperforms the standard graphical lasso and three other methods that incorporate latent variables into GGMs. When applied to gene expression data from ovarian cancer, MGL outperforms standard clustering algorithms in identifying functionally coherent gene sets and predicting survival time of patients. The learned modules and their dependencies provide novel insights into cancer biology as well as identifying possible novel drug targets.

Cite this Paper

BibTeX


@InProceedings{pmlr-v32-celik14,
  title = 	 {Efficient Dimensionality Reduction for High-Dimensional Network Estimation},
  author = 	 {Celik, Safiye and Logsdon, Benjamin and Lee, Su-In},
  booktitle = 	 {Proceedings of the 31st International Conference on Machine Learning},
  pages = 	 {1953--1961},
  year = 	 {2014},
  editor = 	 {Xing, Eric P. and Jebara, Tony},
  volume = 	 {32},
  number =       {2},
  series = 	 {Proceedings of Machine Learning Research},
  address =      {Beijing, China},
  month = 	 {22--24 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v32/celik14.pdf},
  url = 	 {https://proceedings.mlr.press/v32/celik14.html},
  abstract = 	 {We propose module graphical lasso (MGL), an aggressive dimensionality reduction and network estimation technique for a high-dimensional Gaussian graphical model (GGM). MGL achieves scalability, interpretability and robustness by exploiting the modularity property of many real-world networks. Variables are organized into tightly coupled modules and a graph structure is estimated to determine the conditional independencies among modules. MGL iteratively learns the module assignment of variables, the latent variables, each corresponding to a module, and the parameters of the GGM of the latent variables. In synthetic data experiments, MGL outperforms the standard graphical lasso and three other methods that incorporate latent variables into GGMs. When applied to gene expression data from ovarian cancer, MGL outperforms standard clustering algorithms in identifying functionally coherent gene sets and predicting survival time of patients. The learned modules and their dependencies provide novel insights into cancer biology as well as identifying possible novel drug targets.}
}

Endnote

%0 Conference Paper
%T Efficient Dimensionality Reduction for High-Dimensional Network Estimation
%A Safiye Celik
%A Benjamin Logsdon
%A Su-In Lee
%B Proceedings of the 31st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2014
%E Eric P. Xing
%E Tony Jebara
%F pmlr-v32-celik14
%I PMLR
%P 1953--1961
%U https://proceedings.mlr.press/v32/celik14.html
%V 32
%N 2
%X We propose module graphical lasso (MGL), an aggressive dimensionality reduction and network estimation technique for a high-dimensional Gaussian graphical model (GGM). MGL achieves scalability, interpretability and robustness by exploiting the modularity property of many real-world networks. Variables are organized into tightly coupled modules and a graph structure is estimated to determine the conditional independencies among modules. MGL iteratively learns the module assignment of variables, the latent variables, each corresponding to a module, and the parameters of the GGM of the latent variables. In synthetic data experiments, MGL outperforms the standard graphical lasso and three other methods that incorporate latent variables into GGMs. When applied to gene expression data from ovarian cancer, MGL outperforms standard clustering algorithms in identifying functionally coherent gene sets and predicting survival time of patients. The learned modules and their dependencies provide novel insights into cancer biology as well as identifying possible novel drug targets.

RIS


TY  - CPAPER
TI  - Efficient Dimensionality Reduction for High-Dimensional Network Estimation
AU  - Safiye Celik
AU  - Benjamin Logsdon
AU  - Su-In Lee
BT  - Proceedings of the 31st International Conference on Machine Learning
DA  - 2014/06/18
ED  - Eric P. Xing
ED  - Tony Jebara
ID  - pmlr-v32-celik14
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 32
IS  - 2
SP  - 1953
EP  - 1961
L1  - http://proceedings.mlr.press/v32/celik14.pdf
UR  - https://proceedings.mlr.press/v32/celik14.html
AB  - We propose module graphical lasso (MGL), an aggressive dimensionality reduction and network estimation technique for a high-dimensional Gaussian graphical model (GGM). MGL achieves scalability, interpretability and robustness by exploiting the modularity property of many real-world networks. Variables are organized into tightly coupled modules and a graph structure is estimated to determine the conditional independencies among modules. MGL iteratively learns the module assignment of variables, the latent variables, each corresponding to a module, and the parameters of the GGM of the latent variables. In synthetic data experiments, MGL outperforms the standard graphical lasso and three other methods that incorporate latent variables into GGMs. When applied to gene expression data from ovarian cancer, MGL outperforms standard clustering algorithms in identifying functionally coherent gene sets and predicting survival time of patients. The learned modules and their dependencies provide novel insights into cancer biology as well as identifying possible novel drug targets.
ER  -

APA


Celik, S., Logsdon, B., & Lee, S. (2014). Efficient Dimensionality Reduction for High-Dimensional Network Estimation. Proceedings of the 31st International Conference on Machine Learning, in Proceedings of Machine Learning Research 32(2):1953-1961. Available from https://proceedings.mlr.press/v32/celik14.html.
