[edit]
Dimension Reduction via Sum-of-Squares and Improved Clustering Algorithms for Non-Spherical Mixtures
Proceedings of Thirty Ninth Conference on Learning Theory, PMLR 336:224-289, 2026.
Abstract
We develop a new approach for clustering non-spherical (i.e., arbitrary component covariances) Gaussian mixture models via a subroutine based on the sum-of-squares method that finds a low-dimensional separation-preserving projection of the input data. Our method provides a non-spherical analog of the classical dimension reduction based on singular value decomposition that, among several other applications, forms a key component of the celebrated spherical clustering algorithm of Vempala and Wang (2004). As applications, we obtain an algorithm to (1) cluster an arbitrary total-variation separated mixture of $k$ centered (i.e., zero-mean) Gaussians with $n\geq \mathrm{poly}(d) f(w_{\min}^{-1})$ samples and $\mathrm{poly}(n)$ time, and (2) cluster an arbitrary total-variation separated mixture of $k$ Gaussians with identical but arbitrary unknown covariance with $n \geq d^{O(\log w_{\min}^{-1})} f(w_{\min}^{-1})$ samples and $n^{O(\log w_{\min}^{-1})}$ time. Here, $w_{\min}$ is the minimum mixing weight of the input mixture, and $f$ does not depend on the dimension $d$. Our algorithms naturally extend to tolerate a dimension-independent fraction of arbitrary outliers. Before this work, the techniques in the state-of-the-art non-spherical clustering algorithms needed $d^{O(k)} f(w_{\min}^{-1})$ samples and time for clustering such mixtures. Our results may come as a surprise in the context of the $d^{\Omega(k)}$ statistical query and sum-of-squares lower bounds (Diakonikolas et al. (2017, 2024)) for clustering non-spherical Gaussian mixtures. While these results are usually thought to rule out $d^{o(k)}$ cost algorithms for the problem, our results show that the lower bounds can, in fact, be circumvented for a remarkably general class of Gaussian mixtures.