Learning general Gaussian mixtures with efficient score matching

Sitan Chen, Vasilis Kontonis, Kulin Shah
Proceedings of Thirty Eighth Conference on Learning Theory, PMLR 291:1029-1090, 2025.

Abstract

We study the problem of learning mixtures of $k$ Gaussians in $d$ dimensions. We make no separation assumptions on the underlying mixture components: we only require that the covariance matrices have bounded condition number and that the means and covariances lie in a ball of bounded radius. We give an algorithm that draws $d^{\textrm{poly}(k/\epsilon)}$ samples from the target mixture, runs in sample-polynomial time, and constructs a sampler whose output distribution is $\epsilon$-close to the unknown mixture in total variation. Prior works for this problem either (i) required exponential runtime in the dimension $d$, (ii) placed strong assumptions on the instance (e.g., spherical covariances or clusterability), or (iii) had doubly exponential dependence on the number of components $k$. Our approach departs from commonly used techniques for this problem like the method of moments. Instead, we leverage a recently developed reduction, based on diffusion models, from distribution learning to a supervised learning task called score matching. We give an algorithm for the latter by proving a structural result showing that the score function of a Gaussian mixture can be approximated by a piecewise-polynomial function, together with an efficient algorithm for finding that approximation. To our knowledge, this is the first example of diffusion models achieving a state-of-the-art theoretical guarantee for an unsupervised learning task.
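For context on why a piecewise-polynomial approximation of the score is natural, recall the standard closed form of the score of a Gaussian mixture (a textbook identity, included here for illustration rather than taken from the paper): for $p(x) = \sum_{i=1}^{k} w_i \, \mathcal{N}(x; \mu_i, \Sigma_i)$,

\[
\nabla_x \log p(x) \;=\; \sum_{i=1}^{k} \frac{w_i \, \mathcal{N}(x; \mu_i, \Sigma_i)}{\sum_{j=1}^{k} w_j \, \mathcal{N}(x; \mu_j, \Sigma_j)} \; \Sigma_i^{-1} (\mu_i - x).
\]

Each component term $\Sigma_i^{-1}(\mu_i - x)$ is linear in $x$, and the posterior weights multiplying them are nearly constant on regions where a single component dominates, so the score is close to a low-degree polynomial on each such region; the structural result described in the abstract can be read as a quantitative version of this picture.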

Cite this Paper


BibTeX
@InProceedings{pmlr-v291-chen25e,
  title = {Learning general Gaussian mixtures with efficient score matching},
  author = {Chen, Sitan and Kontonis, Vasilis and Shah, Kulin},
  booktitle = {Proceedings of Thirty Eighth Conference on Learning Theory},
  pages = {1029--1090},
  year = {2025},
  editor = {Haghtalab, Nika and Moitra, Ankur},
  volume = {291},
  series = {Proceedings of Machine Learning Research},
  month = {30 Jun--04 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v291/main/assets/chen25e/chen25e.pdf},
  url = {https://proceedings.mlr.press/v291/chen25e.html},
  abstract = {We study the problem of learning mixtures of $k$ Gaussians in $d$ dimensions. We make no separation assumptions on the underlying mixture components: we only require that the covariance matrices have bounded condition number and that the means and covariances lie in a ball of bounded radius. We give an algorithm that draws $d^{\textrm{poly}(k/\epsilon)}$ samples from the target mixture, runs in sample-polynomial time, and constructs a sampler whose output distribution is $\epsilon$-close to the unknown mixture in total variation. Prior works for this problem either (i) required exponential runtime in the dimension $d$, (ii) placed strong assumptions on the instance (e.g., spherical covariances or clusterability), or (iii) had doubly exponential dependence on the number of components $k$. Our approach departs from commonly used techniques for this problem like the method of moments. Instead, we leverage a recently developed reduction, based on diffusion models, from distribution learning to a supervised learning task called score matching. We give an algorithm for the latter by proving a structural result showing that the score function of a Gaussian mixture can be approximated by a piecewise-polynomial function, together with an efficient algorithm for finding that approximation. To our knowledge, this is the first example of diffusion models achieving a state-of-the-art theoretical guarantee for an unsupervised learning task.}
}
Endnote
%0 Conference Paper
%T Learning general Gaussian mixtures with efficient score matching
%A Sitan Chen
%A Vasilis Kontonis
%A Kulin Shah
%B Proceedings of Thirty Eighth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2025
%E Nika Haghtalab
%E Ankur Moitra
%F pmlr-v291-chen25e
%I PMLR
%P 1029--1090
%U https://proceedings.mlr.press/v291/chen25e.html
%V 291
%X We study the problem of learning mixtures of $k$ Gaussians in $d$ dimensions. We make no separation assumptions on the underlying mixture components: we only require that the covariance matrices have bounded condition number and that the means and covariances lie in a ball of bounded radius. We give an algorithm that draws $d^{\textrm{poly}(k/\epsilon)}$ samples from the target mixture, runs in sample-polynomial time, and constructs a sampler whose output distribution is $\epsilon$-close to the unknown mixture in total variation. Prior works for this problem either (i) required exponential runtime in the dimension $d$, (ii) placed strong assumptions on the instance (e.g., spherical covariances or clusterability), or (iii) had doubly exponential dependence on the number of components $k$. Our approach departs from commonly used techniques for this problem like the method of moments. Instead, we leverage a recently developed reduction, based on diffusion models, from distribution learning to a supervised learning task called score matching. We give an algorithm for the latter by proving a structural result showing that the score function of a Gaussian mixture can be approximated by a piecewise-polynomial function, together with an efficient algorithm for finding that approximation. To our knowledge, this is the first example of diffusion models achieving a state-of-the-art theoretical guarantee for an unsupervised learning task.
APA
Chen, S., Kontonis, V. & Shah, K. (2025). Learning general Gaussian mixtures with efficient score matching. Proceedings of Thirty Eighth Conference on Learning Theory, in Proceedings of Machine Learning Research 291:1029-1090. Available from https://proceedings.mlr.press/v291/chen25e.html.