Learning Mixtures of Gaussians with Censored Data

Wai Ming Tai, Bryon Aragam
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:33396-33415, 2023.

Abstract

We study the problem of learning mixtures of Gaussians with censored data. Statistical learning with censored data is a classical problem with numerous practical applications; however, finite-sample guarantees for even simple latent variable models such as Gaussian mixtures are missing. Formally, we are given censored data from a mixture of univariate Gaussians $\sum_{i=1}^k w_i \mathcal{N}(\mu_i, \sigma^2)$, i.e. the sample is observed only if it lies inside a set $S$. The goal is to learn the weights $w_i$ and the means $\mu_i$. We propose an algorithm that takes only $\frac{1}{\varepsilon^{O(k)}}$ samples to estimate the weights $w_i$ and the means $\mu_i$ within $\varepsilon$ error.
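The observation model in the abstract can be illustrated with a minimal simulation: draw from the mixture $\sum_{i=1}^k w_i \mathcal{N}(\mu_i, \sigma^2)$ and keep a draw only if it falls inside the censoring set $S$. This is a hedged sketch of the data-generating process only, not the authors' estimation algorithm; the function name and the choice of $S$ below are illustrative.

```python
import random

def sample_censored_mixture(weights, means, sigma, in_S, n, rng=None):
    """Draw n observed samples from the censored Gaussian mixture.

    A component i is chosen with probability weights[i], a point is drawn
    from N(means[i], sigma^2), and it is kept only if it lies inside the
    censoring set S (membership tested by the predicate in_S). Censored
    draws are discarded, matching the observation model in the abstract.
    """
    rng = rng or random.Random()
    observed = []
    while len(observed) < n:
        i = rng.choices(range(len(weights)), weights=weights)[0]
        x = rng.gauss(means[i], sigma)
        if in_S(x):  # only samples inside S are observed
            observed.append(x)
    return observed

# Illustrative example: two components, censored to S = [0, infinity)
samples = sample_censored_mixture(
    weights=[0.5, 0.5], means=[-1.0, 2.0], sigma=1.0,
    in_S=lambda x: x >= 0.0, n=1000, rng=random.Random(0),
)
```

Rejection sampling is used here purely for exposition: it makes the censoring explicit, at the cost of discarding draws outside $S$.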

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-tai23a,
  title     = {Learning Mixtures of {G}aussians with Censored Data},
  author    = {Tai, Wai Ming and Aragam, Bryon},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {33396--33415},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/tai23a/tai23a.pdf},
  url       = {https://proceedings.mlr.press/v202/tai23a.html},
  abstract  = {We study the problem of learning mixtures of Gaussians with censored data. Statistical learning with censored data is a classical problem with numerous practical applications; however, finite-sample guarantees for even simple latent variable models such as Gaussian mixtures are missing. Formally, we are given censored data from a mixture of univariate Gaussians $ \sum_{i=1}^k w_i \mathcal{N}(\mu_i,\sigma^2), $ i.e. the sample is observed only if it lies inside a set $S$. The goal is to learn the weights $w_i$ and the means $\mu_i$. We propose an algorithm that takes only $\frac{1}{\varepsilon^{O(k)}}$ samples to estimate the weights $w_i$ and the means $\mu_i$ within $\varepsilon$ error.}
}
Endnote
%0 Conference Paper
%T Learning Mixtures of Gaussians with Censored Data
%A Wai Ming Tai
%A Bryon Aragam
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-tai23a
%I PMLR
%P 33396--33415
%U https://proceedings.mlr.press/v202/tai23a.html
%V 202
%X We study the problem of learning mixtures of Gaussians with censored data. Statistical learning with censored data is a classical problem with numerous practical applications; however, finite-sample guarantees for even simple latent variable models such as Gaussian mixtures are missing. Formally, we are given censored data from a mixture of univariate Gaussians $ \sum_{i=1}^k w_i \mathcal{N}(\mu_i,\sigma^2), $ i.e. the sample is observed only if it lies inside a set $S$. The goal is to learn the weights $w_i$ and the means $\mu_i$. We propose an algorithm that takes only $\frac{1}{\varepsilon^{O(k)}}$ samples to estimate the weights $w_i$ and the means $\mu_i$ within $\varepsilon$ error.
APA
Tai, W.M. & Aragam, B. (2023). Learning Mixtures of Gaussians with Censored Data. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:33396-33415. Available from https://proceedings.mlr.press/v202/tai23a.html.