Mixtures of Gaussians are Privately Learnable with a Polynomial Number of Samples

Mohammad Afzali; Hassan Ashtiani; Christopher Liaw

Proceedings of The 35th International Conference on Algorithmic Learning Theory, PMLR 237:47-73, 2024.

Abstract

We study the problem of estimating mixtures of Gaussians under the constraint of differential privacy (DP). Our main result is that $\text{poly}(k,d,1/\alpha,1/\varepsilon,\log(1/\delta))$ samples are sufficient to estimate a mixture of $k$ Gaussians in $\mathbb{R}^d$ up to total variation distance $\alpha$ while satisfying $(\varepsilon, \delta)$-DP. This is the first finite sample complexity upper bound for the problem that does not make any structural assumptions on the GMMs. To solve the problem, we devise a new framework which may be useful for other tasks. On a high level, we show that if a class of distributions (such as Gaussians) is (1) list decodable and (2) admits a “locally small” cover (Bun et al., 2021) with respect to total variation distance, then the class of its mixtures is privately learnable. The proof circumvents a known barrier indicating that, unlike Gaussians, GMMs do not admit a locally small cover (Aden-Ali et al., 2021b).
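
For reference, the two notions the abstract relies on have standard textbook definitions, sketched here as background. The symbols $M$, $D$, $D'$, $S$, $P$, $Q$ and $A$ are notation assumed for this note and are not taken from the paper. A randomized algorithm $M$ is $(\varepsilon, \delta)$-DP if, for every pair of datasets $D, D'$ differing in a single entry and every measurable set of outputs $S$,

$$\Pr[M(D) \in S] \le e^{\varepsilon} \, \Pr[M(D') \in S] + \delta .$$

The total variation distance between two distributions $P$ and $Q$ is the largest gap in probability they assign to any event $A$:

$$d_{\mathrm{TV}}(P, Q) = \sup_{A} \, |P(A) - Q(A)| .$$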

Cite this Paper

BibTeX


@InProceedings{pmlr-v237-afzali24a,
  title     = {Mixtures of Gaussians are Privately Learnable with a Polynomial Number of Samples},
  author    = {Afzali, Mohammad and Ashtiani, Hassan and Liaw, Christopher},
  booktitle = {Proceedings of The 35th International Conference on Algorithmic Learning Theory},
  pages     = {47--73},
  year      = {2024},
  editor    = {Vernade, Claire and Hsu, Daniel},
  volume    = {237},
  series    = {Proceedings of Machine Learning Research},
  month     = {25--28 Feb},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v237/afzali24a/afzali24a.pdf},
  url       = {https://proceedings.mlr.press/v237/afzali24a.html},
  abstract  = {We study the problem of estimating mixtures of Gaussians under the constraint of differential privacy (DP). Our main result is that $\text{poly}(k,d,1/\alpha,1/\varepsilon,\log(1/\delta))$ samples are sufficient to estimate a mixture of $k$ Gaussians in $\mathbb{R}^d$ up to total variation distance $\alpha$ while satisfying $(\varepsilon, \delta)$-DP. This is the first finite sample complexity upper bound for the problem that does not make any structural assumptions on the GMMs. To solve the problem, we devise a new framework which may be useful for other tasks. On a high level, we show that if a class of distributions (such as Gaussians) is (1) list decodable and (2) admits a “locally small” cover (Bun et al., 2021) with respect to total variation distance, then the class of its mixtures is privately learnable. The proof circumvents a known barrier indicating that, unlike Gaussians, GMMs do not admit a locally small cover (Aden-Ali et al., 2021b).}
}

Endnote

%0 Conference Paper
%T Mixtures of Gaussians are Privately Learnable with a Polynomial Number of Samples
%A Mohammad Afzali
%A Hassan Ashtiani
%A Christopher Liaw
%B Proceedings of The 35th International Conference on Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2024
%E Claire Vernade
%E Daniel Hsu
%F pmlr-v237-afzali24a
%I PMLR
%P 47--73
%U https://proceedings.mlr.press/v237/afzali24a.html
%V 237
%X We study the problem of estimating mixtures of Gaussians under the constraint of differential privacy (DP). Our main result is that $\text{poly}(k,d,1/\alpha,1/\varepsilon,\log(1/\delta))$ samples are sufficient to estimate a mixture of $k$ Gaussians in $\mathbb{R}^d$ up to total variation distance $\alpha$ while satisfying $(\varepsilon, \delta)$-DP. This is the first finite sample complexity upper bound for the problem that does not make any structural assumptions on the GMMs. To solve the problem, we devise a new framework which may be useful for other tasks. On a high level, we show that if a class of distributions (such as Gaussians) is (1) list decodable and (2) admits a “locally small” cover (Bun et al., 2021) with respect to total variation distance, then the class of its mixtures is privately learnable. The proof circumvents a known barrier indicating that, unlike Gaussians, GMMs do not admit a locally small cover (Aden-Ali et al., 2021b).

APA


Afzali, M., Ashtiani, H. & Liaw, C. (2024). Mixtures of Gaussians are Privately Learnable with a Polynomial Number of Samples. Proceedings of The 35th International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 237:47-73. Available from https://proceedings.mlr.press/v237/afzali24a.html.
