Dimension Reduction via Sum-of-Squares and Improved Clustering Algorithms for Non-Spherical Mixtures

Prashanti Anderson, Mitali Bafna, Rares-Darius Buhai, Pravesh K. Kothari, David Steurer
Proceedings of Thirty Ninth Conference on Learning Theory, PMLR 336:224-289, 2026.

Abstract

We develop a new approach for clustering non-spherical (i.e., arbitrary component covariances) Gaussian mixture models via a subroutine based on the sum-of-squares method that finds a low-dimensional separation-preserving projection of the input data. Our method provides a non-spherical analog of the classical dimension reduction based on singular value decomposition that, among several other applications, forms a key component of the celebrated spherical clustering algorithm of Vempala and Wang (2004). As applications, we obtain an algorithm to (1) cluster an arbitrary total-variation separated mixture of $k$ centered (i.e., zero-mean) Gaussians with $n\geq \mathrm{poly}(d) f(w_{\min}^{-1})$ samples and $\mathrm{poly}(n)$ time, and (2) cluster an arbitrary total-variation separated mixture of $k$ Gaussians with identical but arbitrary unknown covariance with $n \geq d^{O(\log w_{\min}^{-1})} f(w_{\min}^{-1})$ samples and $n^{O(\log w_{\min}^{-1})}$ time. Here, $w_{\min}$ is the minimum mixing weight of the input mixture, and $f$ does not depend on the dimension $d$. Our algorithms naturally extend to tolerate a dimension-independent fraction of arbitrary outliers. Before this work, the techniques in the state-of-the-art non-spherical clustering algorithms needed $d^{O(k)} f(w_{\min}^{-1})$ samples and time for clustering such mixtures. Our results may come as a surprise in the context of the $d^{\Omega(k)}$ statistical query and sum-of-squares lower bounds (Diakonikolas et al. (2017, 2024)) for clustering non-spherical Gaussian mixtures. While these results are usually thought to rule out $d^{o(k)}$ cost algorithms for the problem, our results show that the lower bounds can, in fact, be circumvented for a remarkably general class of Gaussian mixtures.

Cite this Paper


BibTeX
@InProceedings{pmlr-v336-anderson26a, title = {Dimension Reduction via Sum-of-Squares and Improved Clustering Algorithms for Non-Spherical Mixtures}, author = {Anderson, Prashanti and Bafna, Mitali and Buhai, Rares-Darius and Kothari, Pravesh K. and Steurer, David}, booktitle = {Proceedings of Thirty Ninth Conference on Learning Theory}, pages = {224--289}, year = {2026}, editor = {Hanneke, Steve and Lattimore, Tor}, volume = {336}, series = {Proceedings of Machine Learning Research}, month = {29 Jun--03 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v336/main/assets/anderson26a/anderson26a.pdf}, url = {https://proceedings.mlr.press/v336/anderson26a.html}, abstract = {We develop a new approach for clustering non-spherical (i.e., arbitrary component covariances) Gaussian mixture models via a subroutine based on the sum-of-squares method that finds a low-dimensional separation-preserving projection of the input data. Our method provides a non-spherical analog of the classical dimension reduction based on singular value decomposition that, among several other applications, forms a key component of the celebrated spherical clustering algorithm of Vempala and Wang (2004). As applications, we obtain an algorithm to (1) cluster an arbitrary total-variation separated mixture of $k$ centered (i.e., zero-mean) Gaussians with $n\geq \mathrm{poly}(d) f(w_{\min}^{-1})$ samples and $\mathrm{poly}(n)$ time, and (2) cluster an arbitrary total-variation separated mixture of $k$ Gaussians with identical but arbitrary unknown covariance with $n \geq d^{O(\log w_{\min}^{-1})} f(w_{\min}^{-1})$ samples and $n^{O(\log w_{\min}^{-1})}$ time. Here, $w_{\min}$ is the minimum mixing weight of the input mixture, and $f$ does not depend on the dimension $d$. Our algorithms naturally extend to tolerate a dimension-independent fraction of arbitrary outliers. Before this work, the techniques in the state-of-the-art non-spherical clustering algorithms needed $d^{O(k)} f(w_{\min}^{-1})$ samples and time for clustering such mixtures. Our results may come as a surprise in the context of the $d^{\Omega(k)}$ statistical query and sum-of-squares lower bounds (Diakonikolas et al. (2017, 2024)) for clustering non-spherical Gaussian mixtures. While these results are usually thought to rule out $d^{o(k)}$ cost algorithms for the problem, our results show that the lower bounds can, in fact, be circumvented for a remarkably general class of Gaussian mixtures.} }
Endnote
%0 Conference Paper %T Dimension Reduction via Sum-of-Squares and Improved Clustering Algorithms for Non-Spherical Mixtures %A Prashanti Anderson %A Mitali Bafna %A Rares-Darius Buhai %A Pravesh K. Kothari %A David Steurer %B Proceedings of Thirty Ninth Conference on Learning Theory %C Proceedings of Machine Learning Research %D 2026 %E Steve Hanneke %E Tor Lattimore %F pmlr-v336-anderson26a %I PMLR %P 224--289 %U https://proceedings.mlr.press/v336/anderson26a.html %V 336 %X We develop a new approach for clustering non-spherical (i.e., arbitrary component covariances) Gaussian mixture models via a subroutine based on the sum-of-squares method that finds a low-dimensional separation-preserving projection of the input data. Our method provides a non-spherical analog of the classical dimension reduction based on singular value decomposition that, among several other applications, forms a key component of the celebrated spherical clustering algorithm of Vempala and Wang (2004). As applications, we obtain an algorithm to (1) cluster an arbitrary total-variation separated mixture of $k$ centered (i.e., zero-mean) Gaussians with $n\geq \mathrm{poly}(d) f(w_{\min}^{-1})$ samples and $\mathrm{poly}(n)$ time, and (2) cluster an arbitrary total-variation separated mixture of $k$ Gaussians with identical but arbitrary unknown covariance with $n \geq d^{O(\log w_{\min}^{-1})} f(w_{\min}^{-1})$ samples and $n^{O(\log w_{\min}^{-1})}$ time. Here, $w_{\min}$ is the minimum mixing weight of the input mixture, and $f$ does not depend on the dimension $d$. Our algorithms naturally extend to tolerate a dimension-independent fraction of arbitrary outliers. Before this work, the techniques in the state-of-the-art non-spherical clustering algorithms needed $d^{O(k)} f(w_{\min}^{-1})$ samples and time for clustering such mixtures. Our results may come as a surprise in the context of the $d^{\Omega(k)}$ statistical query and sum-of-squares lower bounds (Diakonikolas et al. (2017, 2024)) for clustering non-spherical Gaussian mixtures. While these results are usually thought to rule out $d^{o(k)}$ cost algorithms for the problem, our results show that the lower bounds can, in fact, be circumvented for a remarkably general class of Gaussian mixtures.
APA
Anderson, P., Bafna, M., Buhai, R., Kothari, P.K. & Steurer, D.. (2026). Dimension Reduction via Sum-of-Squares and Improved Clustering Algorithms for Non-Spherical Mixtures. Proceedings of Thirty Ninth Conference on Learning Theory, in Proceedings of Machine Learning Research 336:224-289 Available from https://proceedings.mlr.press/v336/anderson26a.html.

Related Material