Robust Gaussian Graphical Model Estimation with Arbitrary Corruption

Lingxiao Wang; Quanquan Gu

Proceedings of the 34th International Conference on Machine Learning, PMLR 70:3617-3626, 2017.

Abstract

We study the problem of estimating the high-dimensional Gaussian graphical model where the data are arbitrarily corrupted. We propose a robust estimator for the sparse precision matrix in the high-dimensional regime. At the core of our method is a robust covariance matrix estimator, which is based on a truncated inner product. We establish statistical guarantees for our estimator on both estimation error and model selection consistency. In particular, we show that, provided the number of corrupted samples $n_2$ for each variable satisfies $n_2 \lesssim \sqrt{n}/\sqrt{\log d}$, where $n$ is the sample size and $d$ is the number of variables, the proposed robust precision matrix estimator attains the same statistical rate as the standard estimator for Gaussian graphical models. In addition, we propose a hypothesis testing procedure to assess the uncertainty of our robust estimator. We demonstrate the effectiveness of our method through extensive experiments on both synthetic data and real-world genomic data.

Cite this Paper

BibTeX


@InProceedings{pmlr-v70-wang17d,
  title = 	 {Robust {G}aussian Graphical Model Estimation with Arbitrary Corruption},
  author =       {Lingxiao Wang and Quanquan Gu},
  booktitle = 	 {Proceedings of the 34th International Conference on Machine Learning},
  pages = 	 {3617--3626},
  year = 	 {2017},
  editor = 	 {Precup, Doina and Teh, Yee Whye},
  volume = 	 {70},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {06--11 Aug},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v70/wang17d/wang17d.pdf},
  url = 	 {https://proceedings.mlr.press/v70/wang17d.html},
  abstract = 	 {We study the problem of estimating the high-dimensional Gaussian graphical model where the data are arbitrarily corrupted. We propose a robust estimator for the sparse precision matrix in the high-dimensional regime. At the core of our method is a robust covariance matrix estimator, which is based on truncated inner product. We establish the statistical guarantee of our estimator on both estimation error and model selection consistency. In particular, we show that provided that the number of corrupted samples $n_2$ for each variable satisfies $n_2 \lesssim \sqrt{n}/\sqrt{\log d}$, where $n$ is the sample size and $d$ is the number of variables, the proposed robust precision matrix estimator attains the same statistical rate as the standard estimator for Gaussian graphical models. In addition, we propose a hypothesis testing procedure to assess the uncertainty of our robust estimator. We demonstrate the effectiveness of our method through extensive experiments on both synthetic data and real-world genomic data.}
}

Endnote

%0 Conference Paper
%T Robust Gaussian Graphical Model Estimation with Arbitrary Corruption
%A Lingxiao Wang
%A Quanquan Gu
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-wang17d
%I PMLR
%P 3617--3626
%U https://proceedings.mlr.press/v70/wang17d.html
%V 70
%X We study the problem of estimating the high-dimensional Gaussian graphical model where the data are arbitrarily corrupted. We propose a robust estimator for the sparse precision matrix in the high-dimensional regime. At the core of our method is a robust covariance matrix estimator, which is based on truncated inner product. We establish the statistical guarantee of our estimator on both estimation error and model selection consistency. In particular, we show that provided that the number of corrupted samples $n_2$ for each variable satisfies $n_2 \lesssim \sqrt{n}/\sqrt{\log d}$, where $n$ is the sample size and $d$ is the number of variables, the proposed robust precision matrix estimator attains the same statistical rate as the standard estimator for Gaussian graphical models. In addition, we propose a hypothesis testing procedure to assess the uncertainty of our robust estimator. We demonstrate the effectiveness of our method through extensive experiments on both synthetic data and real-world genomic data.

APA


Wang, L. & Gu, Q. (2017). Robust Gaussian Graphical Model Estimation with Arbitrary Corruption. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:3617-3626. Available from https://proceedings.mlr.press/v70/wang17d.html.
