[edit]

# Hidden Integrality of SDP Relaxations for Sub-Gaussian Mixture Models

*Proceedings of the 31st Conference On Learning Theory*, PMLR 75:1931-1965, 2018.

#### Abstract

We consider the problem of finding discrete clustering structures under Sub-Gaussian Mixture Models. We establish a hidden integrality property of a semidefinite programming (SDP) relaxation for this problem: while the optimal solutions to the SDP are not integer-valued in general, their estimation errors can be upper bounded by the error of an idealized integer program. The error of the integer program, and hence that of the SDP, are further shown to decay exponentially in the signal-to-noise ratio. To the best of our knowledge, this is the first exponentially decaying error bound for convex relaxations of mixture models. A special case of this result shows that in certain regimes the SDP solutions are in fact integral and exact, improving on existing exact recovery results for convex relaxations. More generally, our result establishes sufficient conditions for the SDP to correctly recover the cluster memberships of at least $(1-\delta)$ fraction of the points for any $\delta\in(0,1)$. Error bounds for estimating cluster centers can also be derived directly from our results.