Mixtures of Gaussians are Privately Learnable with a Polynomial Number of Samples
Proceedings of The 35th International Conference on Algorithmic Learning Theory, PMLR 237:47-73, 2024.
Abstract
We study the problem of estimating mixtures of Gaussians under the constraint of differential privacy (DP). Our main result is that poly(k,d,1/α,1/ε,log(1/δ)) samples are sufficient to estimate a mixture of k Gaussians in R^d up to total variation distance α while satisfying (ε,δ)-DP. This is the first finite sample complexity upper bound for the problem that makes no structural assumptions on the Gaussian mixture models (GMMs). To solve the problem, we devise a new framework that may be useful for other tasks. At a high level, we show that if a class of distributions (such as Gaussians) is (1) list decodable and (2) admits a “locally small” cover (Bun et al., 2021) with respect to total variation distance, then the class of its mixtures is privately learnable. The proof circumvents a known barrier indicating that, unlike Gaussians, GMMs do not admit a locally small cover (Aden-Ali et al., 2021b).