Learning Tree Structures from Noisy Data

Konstantinos E. Nikolakakis; Dionysios S. Kalogerias; Anand D. Sarwate

Learning Tree Structures from Noisy Data

Konstantinos E. Nikolakakis, Dionysios S. Kalogerias, Anand D. Sarwate

Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, PMLR 89:1771-1782, 2019.

Abstract

We provide high-probability sample complexity guarantees for exact structure recovery of tree-structured graphical models, when only noisy observations of the respective vertex emissions are available. We assume that the hidden variables follow either an Ising model or a Gaussian graphical model, and the observables are noise-corrupted versions of the hidden variables: We consider multiplicative

$\pm 1$ binary noise for Ising models, and additive Gaussian noise for Gaussian models. Such hidden models arise naturally in a variety of applications such as physics, biology, computer science, and finance. We study the impact of measurement noise on the task of learning the underlying tree structure via the well-known \textit{Chow-Liu algorithm} and provide formal sample complexity guarantees for exact recovery. In particular, for a tree with

$p$ vertices and probability of failure

$\delta>0$ , we show that the number of necessary samples for exact structure recovery is of the order of

$\mc{O}(\log(p/\delta))$ for Ising models (which remains the \textit{same as in the noiseless case}), and

$\mc{O}(\mathrm{polylog}{(p/\delta)})$ for Gaussian models.

Cite this Paper

BibTeX


@InProceedings{pmlr-v89-nikolakakis19a,
  title = 	 {Learning Tree Structures from Noisy Data},
  author =       {Nikolakakis, Konstantinos E. and Kalogerias, Dionysios S. and Sarwate, Anand D.},
  booktitle = 	 {Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics},
  pages = 	 {1771--1782},
  year = 	 {2019},
  editor = 	 {Chaudhuri, Kamalika and Sugiyama, Masashi},
  volume = 	 {89},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {16--18 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v89/nikolakakis19a/nikolakakis19a.pdf},
  url = 	 {https://proceedings.mlr.press/v89/nikolakakis19a.html},
  abstract = 	 {We provide high-probability sample complexity guarantees for exact structure recovery of tree-structured graphical models, when only noisy observations of the respective vertex emissions are available. We assume that the hidden variables follow either an Ising model or a Gaussian graphical model, and the observables are noise-corrupted versions of the hidden variables: We consider multiplicative $\pm 1$ binary noise for Ising models, and additive Gaussian noise for Gaussian models. Such hidden models arise naturally in a variety of applications such as physics, biology, computer science, and finance. We study the impact of measurement noise on the task of learning the underlying tree structure via the well-known \textit{Chow-Liu algorithm} and provide formal sample complexity guarantees for exact recovery. In particular, for a tree with $p$ vertices and probability of failure $\delta>0$, we show that the number of necessary samples for exact structure recovery is of the order of $\mc{O}(\log(p/\delta))$ for Ising models (which remains the \textit{same as in the noiseless case}), and $\mc{O}(\mathrm{polylog}{(p/\delta)})$ for Gaussian models.}
}

Endnote

%0 Conference Paper
%T Learning Tree Structures from Noisy Data
%A Konstantinos E. Nikolakakis
%A Dionysios S. Kalogerias
%A Anand D. Sarwate
%B Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Masashi Sugiyama	
%F pmlr-v89-nikolakakis19a
%I PMLR
%P 1771--1782
%U https://proceedings.mlr.press/v89/nikolakakis19a.html
%V 89
%X We provide high-probability sample complexity guarantees for exact structure recovery of tree-structured graphical models, when only noisy observations of the respective vertex emissions are available. We assume that the hidden variables follow either an Ising model or a Gaussian graphical model, and the observables are noise-corrupted versions of the hidden variables: We consider multiplicative $\pm 1$ binary noise for Ising models, and additive Gaussian noise for Gaussian models. Such hidden models arise naturally in a variety of applications such as physics, biology, computer science, and finance. We study the impact of measurement noise on the task of learning the underlying tree structure via the well-known \textit{Chow-Liu algorithm} and provide formal sample complexity guarantees for exact recovery. In particular, for a tree with $p$ vertices and probability of failure $\delta>0$, we show that the number of necessary samples for exact structure recovery is of the order of $\mc{O}(\log(p/\delta))$ for Ising models (which remains the \textit{same as in the noiseless case}), and $\mc{O}(\mathrm{polylog}{(p/\delta)})$ for Gaussian models.

APA


Nikolakakis, K.E., Kalogerias, D.S. & Sarwate, A.D.. (2019). Learning Tree Structures from Noisy Data. Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 89:1771-1782 Available from https://proceedings.mlr.press/v89/nikolakakis19a.html.

Learning Tree Structures from Noisy Data

Abstract

Cite this Paper

Related Material