The More, the Merrier: the Blessing of Dimensionality for Learning Large Gaussian Mixtures

Joseph Anderson; Mikhail Belkin; Navin Goyal; Luis Rademacher; James Voss

Proceedings of The 27th Conference on Learning Theory, PMLR 35:1135-1164, 2014.

Abstract

In this paper we show that very large mixtures of Gaussians are efficiently learnable in high dimension. More precisely, we prove that a mixture with known identical covariance matrices whose number of components is a polynomial of any fixed degree in the dimension n is polynomially learnable as long as a certain non-degeneracy condition on the means is satisfied. It turns out that this condition is generic in the sense of smoothed complexity, as soon as the dimensionality of the space is high enough. Moreover, we prove that no such condition can possibly exist in low dimension and the problem of learning the parameters is generically hard. In contrast, much of the existing work on Gaussian Mixtures relies on low-dimensional projections and thus hits an artificial barrier. Our main result on mixture recovery relies on a new “Poissonization”-based technique, which transforms a mixture of Gaussians to a linear map of a product distribution. The problem of learning this map can be efficiently solved using some recent results on tensor decompositions and Independent Component Analysis (ICA), thus giving an algorithm for recovering the mixture. In addition, we combine our low-dimensional hardness results for Gaussian mixtures with Poissonization to show how to embed difficult instances of low-dimensional Gaussian mixtures into the ICA setting, thus establishing exponential information-theoretic lower bounds for underdetermined ICA in low dimension. To the best of our knowledge, this is the first such result in the literature. In addition to contributing to the problem of Gaussian mixture learning, we believe that this work is among the first steps toward better understanding the rare phenomenon of the “blessing of dimensionality” in the computational aspects of statistical inference.
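
The Poissonization step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the construction, so the following assumes one common reading: draw R from a Poisson distribution with rate lam, draw R points from the mixture, and sum them. By Poisson thinning, the number of draws from component i is then an independent Poisson(lam * w_i) variable, so the sum equals a linear map (the matrix of means) applied to a vector with independent coordinates, plus Gaussian noise. The function name `poissonized_sample`, the parameters `lam` and `sigma`, and the spherical-covariance simplification are all assumptions made for illustration; how the paper treats the noise term is not reproduced here.

```python
import numpy as np

def poissonized_sample(rng, means, weights, lam, sigma=1.0):
    """Draw one Poissonized observation from a spherical Gaussian mixture.

    means:   (k, n) array, one row per component mean (hypothetical layout).
    weights: (k,) mixing weights summing to 1.
    lam:     Poisson rate for the number of mixture draws that are summed.
    Returns (x, counts), where x is the sum of the draws and counts[i] is
    the number of draws that came from component i.
    """
    k, n = means.shape
    r = rng.poisson(lam)                        # random number of draws
    labels = rng.choice(k, size=r, p=weights)   # component of each draw
    counts = np.bincount(labels, minlength=k)   # per-component draw counts
    noise = sigma * rng.standard_normal((r, n)) # Gaussian part of each draw
    x = (means[labels] + noise).sum(axis=0)     # sum of the r mixture draws
    return x, counts

rng = np.random.default_rng(0)
means = np.array([[1.0, 0.0], [0.0, 2.0], [-1.0, -1.0]])
weights = np.array([0.5, 0.3, 0.2])
x, counts = poissonized_sample(rng, means, weights, lam=5.0)
# By construction, x - means.T @ counts is exactly the summed Gaussian noise.
residual = x - means.T @ counts
```

In this sketch the coordinates of `counts` play the role of the independent sources, and `means.T` plays the role of the linear map that an ICA-style method would try to recover.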

Cite this Paper

BibTeX

@InProceedings{pmlr-v35-anderson14,
  title = 	 {The More, the Merrier: the Blessing of Dimensionality for Learning Large Gaussian Mixtures},
  author = 	 {Anderson, Joseph and Belkin, Mikhail and Goyal, Navin and Rademacher, Luis and Voss, James},
  booktitle = 	 {Proceedings of The 27th Conference on Learning Theory},
  pages = 	 {1135--1164},
  year = 	 {2014},
  editor = 	 {Balcan, Maria Florina and Feldman, Vitaly and Szepesvári, Csaba},
  volume = 	 {35},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Barcelona, Spain},
  month = 	 {13--15 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v35/anderson14.pdf},
  url = 	 {https://proceedings.mlr.press/v35/anderson14.html},
  abstract = 	 {In this paper we show that very large mixtures of Gaussians are efficiently learnable in high dimension. More precisely, we prove that a mixture with known identical covariance matrices whose number of components is a polynomial of any fixed degree in the dimension n is polynomially learnable as long as a certain non-degeneracy condition on the means is satisfied. It turns out that this condition is generic in the sense of smoothed complexity, as soon as the dimensionality of the space is high enough. Moreover, we prove that no such condition can possibly exist in low dimension and the problem of learning the parameters is generically hard. In contrast, much of the existing work on Gaussian Mixtures relies on low-dimensional projections and thus hits an artificial barrier. Our main result on mixture recovery relies on a new “Poissonization”-based technique, which transforms a mixture of Gaussians to a linear map of a product distribution. The problem of learning this map can be efficiently solved using some recent results on tensor decompositions and Independent Component Analysis (ICA), thus giving an algorithm for recovering the mixture. In addition, we combine our low-dimensional hardness results for Gaussian mixtures with Poissonization to show how to embed difficult instances of low-dimensional Gaussian mixtures into the ICA setting, thus establishing exponential information-theoretic lower bounds for underdetermined ICA in low dimension. To the best of our knowledge, this is the first such result in the literature. In addition to contributing to the problem of Gaussian mixture learning, we believe that this work is among the first steps toward better understanding the rare phenomenon of the “blessing of dimensionality” in the computational aspects of statistical inference.}
}

Endnote

%0 Conference Paper
%T The More, the Merrier: the Blessing of Dimensionality for Learning Large Gaussian Mixtures
%A Joseph Anderson
%A Mikhail Belkin
%A Navin Goyal
%A Luis Rademacher
%A James Voss
%B Proceedings of The 27th Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2014
%E Maria Florina Balcan
%E Vitaly Feldman
%E Csaba Szepesvári
%F pmlr-v35-anderson14
%I PMLR
%P 1135--1164
%U https://proceedings.mlr.press/v35/anderson14.html
%V 35
%X In this paper we show that very large mixtures of Gaussians are efficiently learnable in high dimension. More precisely, we prove that a mixture with known identical covariance matrices whose number of components is a polynomial of any fixed degree in the dimension n is polynomially learnable as long as a certain non-degeneracy condition on the means is satisfied. It turns out that this condition is generic in the sense of smoothed complexity, as soon as the dimensionality of the space is high enough. Moreover, we prove that no such condition can possibly exist in low dimension and the problem of learning the parameters is generically hard. In contrast, much of the existing work on Gaussian Mixtures relies on low-dimensional projections and thus hits an artificial barrier. Our main result on mixture recovery relies on a new “Poissonization”-based technique, which transforms a mixture of Gaussians to a linear map of a product distribution. The problem of learning this map can be efficiently solved using some recent results on tensor decompositions and Independent Component Analysis (ICA), thus giving an algorithm for recovering the mixture. In addition, we combine our low-dimensional hardness results for Gaussian mixtures with Poissonization to show how to embed difficult instances of low-dimensional Gaussian mixtures into the ICA setting, thus establishing exponential information-theoretic lower bounds for underdetermined ICA in low dimension. To the best of our knowledge, this is the first such result in the literature. In addition to contributing to the problem of Gaussian mixture learning, we believe that this work is among the first steps toward better understanding the rare phenomenon of the “blessing of dimensionality” in the computational aspects of statistical inference.

RIS

TY  - CPAPER
TI  - The More, the Merrier: the Blessing of Dimensionality for Learning Large Gaussian Mixtures
AU  - Joseph Anderson
AU  - Mikhail Belkin
AU  - Navin Goyal
AU  - Luis Rademacher
AU  - James Voss
BT  - Proceedings of The 27th Conference on Learning Theory
DA  - 2014/05/29
ED  - Maria Florina Balcan
ED  - Vitaly Feldman
ED  - Csaba Szepesvári
ID  - pmlr-v35-anderson14
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 35
SP  - 1135
EP  - 1164
L1  - http://proceedings.mlr.press/v35/anderson14.pdf
UR  - https://proceedings.mlr.press/v35/anderson14.html
AB  - In this paper we show that very large mixtures of Gaussians are efficiently learnable in high dimension. More precisely, we prove that a mixture with known identical covariance matrices whose number of components is a polynomial of any fixed degree in the dimension n is polynomially learnable as long as a certain non-degeneracy condition on the means is satisfied. It turns out that this condition is generic in the sense of smoothed complexity, as soon as the dimensionality of the space is high enough. Moreover, we prove that no such condition can possibly exist in low dimension and the problem of learning the parameters is generically hard. In contrast, much of the existing work on Gaussian Mixtures relies on low-dimensional projections and thus hits an artificial barrier. Our main result on mixture recovery relies on a new “Poissonization”-based technique, which transforms a mixture of Gaussians to a linear map of a product distribution. The problem of learning this map can be efficiently solved using some recent results on tensor decompositions and Independent Component Analysis (ICA), thus giving an algorithm for recovering the mixture. In addition, we combine our low-dimensional hardness results for Gaussian mixtures with Poissonization to show how to embed difficult instances of low-dimensional Gaussian mixtures into the ICA setting, thus establishing exponential information-theoretic lower bounds for underdetermined ICA in low dimension. To the best of our knowledge, this is the first such result in the literature. In addition to contributing to the problem of Gaussian mixture learning, we believe that this work is among the first steps toward better understanding the rare phenomenon of the “blessing of dimensionality” in the computational aspects of statistical inference.
ER  -

APA

Anderson, J., Belkin, M., Goyal, N., Rademacher, L. & Voss, J. (2014). The More, the Merrier: the Blessing of Dimensionality for Learning Large Gaussian Mixtures. Proceedings of The 27th Conference on Learning Theory, in Proceedings of Machine Learning Research 35:1135-1164. Available from https://proceedings.mlr.press/v35/anderson14.html.
