On the Minimax Optimality of the EM Algorithm for Learning Two-Component Mixed Linear Regression

Jeongyeol Kwon, Nhat Ho, Constantine Caramanis
Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:1405-1413, 2021.

Abstract

We study the convergence rates of the EM algorithm for learning two-component mixed linear regression under all regimes of signal-to-noise ratio (SNR). We resolve a long-standing question that many recent results have attempted to tackle: we completely characterize the convergence behavior of EM, and show that the EM algorithm achieves minimax optimal sample complexity under all SNR regimes. In particular, when the SNR is sufficiently large, the EM updates converge to the true parameter $\theta^{*}$ at the standard parametric convergence rate $\mathcal{O}((d/n)^{1/2})$ after $\mathcal{O}(\log(n/d))$ iterations. In the regime where the SNR is above $\mathcal{O}((d/n)^{1/4})$ and below some constant, the EM iterates converge to a $\mathcal{O}({\rm SNR}^{-1} (d/n)^{1/2})$ neighborhood of the true parameter, when the number of iterations is of the order $\mathcal{O}({\rm SNR}^{-2} \log(n/d))$. In the low SNR regime where the SNR is below $\mathcal{O}((d/n)^{1/4})$, we show that EM converges to a $\mathcal{O}((d/n)^{1/4})$ neighborhood of the true parameters, after $\mathcal{O}((n/d)^{1/2})$ iterations. Notably, these results are achieved under mild conditions of either random initialization or an efficiently computable local initialization. By providing tight convergence guarantees of the EM algorithm in middle-to-low SNR regimes, we fill the remaining gap in the literature, and significantly, reveal that in low SNR, EM changes rate, matching the $n^{-1/4}$ rate of the MLE, a behavior that previous work had been unable to show.
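To make the object of study concrete, below is a minimal sketch (Python/NumPy) of the standard finite-sample EM iteration for the symmetric two-component model $y_i = z_i \langle x_i, \theta^{*} \rangle + \epsilon_i$, with hidden labels $z_i$ uniform on $\{-1, +1\}$, Gaussian noise, and the SNR commonly measured as $\|\theta^{*}\|/\sigma$. This is the textbook update for this model under assumed conventions, not code released with the paper; the function name em_mlr, the known noise level sigma, and the simulated Gaussian design are illustrative assumptions.

# Minimal sketch (not from the paper): EM for symmetric two-component mixed
# linear regression, y_i = z_i * <x_i, theta*> + noise, z_i uniform on {-1, +1}.
# Assumes an isotropic Gaussian design and a known noise level sigma.
import numpy as np

def em_mlr(X, y, sigma, n_iters=50, theta0=None, rng=None):
    """EM iteration: theta <- (X^T X)^{-1} X^T (tanh(y * (X theta) / sigma^2) * y)."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    theta = rng.standard_normal(d) if theta0 is None else np.asarray(theta0, dtype=float).copy()
    XtX_inv = np.linalg.inv(X.T @ X)  # normal matrix; fine when n >> d
    for _ in range(n_iters):
        # E-step: 2 * P(z_i = +1 | x_i, y_i, theta) - 1 = tanh(y_i <x_i, theta> / sigma^2)
        w = np.tanh(y * (X @ theta) / sigma**2)
        # M-step: least squares against the reweighted responses w_i * y_i
        theta = XtX_inv @ (X.T @ (w * y))
    return theta

if __name__ == "__main__":
    # Toy usage in a moderate-SNR regime; the estimate is identified only up to sign.
    rng = np.random.default_rng(0)
    n, d, sigma = 5000, 5, 1.0
    theta_star = rng.standard_normal(d)
    X = rng.standard_normal((n, d))
    z = rng.choice([-1.0, 1.0], size=n)
    y = z * (X @ theta_star) + sigma * rng.standard_normal(n)
    theta_hat = em_mlr(X, y, sigma, n_iters=100, rng=rng)
    err = min(np.linalg.norm(theta_hat - theta_star), np.linalg.norm(theta_hat + theta_star))
    print(f"estimation error up to sign: {err:.4f}")

One way to read the regimes in the abstract through this sketch: when the SNR is large the tanh weights saturate near $\pm 1$, so each update behaves like ordinary least squares on (nearly) correctly labeled data, while at low SNR the weights stay close to zero and progress per iteration shrinks, which is consistent with the larger iteration counts quoted above.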

Cite this Paper


BibTeX
@InProceedings{pmlr-v130-kwon21b,
  title = {On the Minimax Optimality of the EM Algorithm for Learning Two-Component Mixed Linear Regression},
  author = {Kwon, Jeongyeol and Ho, Nhat and Caramanis, Constantine},
  booktitle = {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages = {1405--1413},
  year = {2021},
  editor = {Banerjee, Arindam and Fukumizu, Kenji},
  volume = {130},
  series = {Proceedings of Machine Learning Research},
  month = {13--15 Apr},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v130/kwon21b/kwon21b.pdf},
  url = {https://proceedings.mlr.press/v130/kwon21b.html},
  abstract = {We study the convergence rates of the EM algorithm for learning two-component mixed linear regression under all regimes of signal-to-noise ratio (SNR). We resolve a long-standing question that many recent results have attempted to tackle: we completely characterize the convergence behavior of EM, and show that the EM algorithm achieves minimax optimal sample complexity under all SNR regimes. In particular, when the SNR is sufficiently large, the EM updates converge to the true parameter $\theta^{*}$ at the standard parametric convergence rate $\mathcal{O}((d/n)^{1/2})$ after $\mathcal{O}(\log(n/d))$ iterations. In the regime where the SNR is above $\mathcal{O}((d/n)^{1/4})$ and below some constant, the EM iterates converge to a $\mathcal{O}({\rm SNR}^{-1} (d/n)^{1/2})$ neighborhood of the true parameter, when the number of iterations is of the order $\mathcal{O}({\rm SNR}^{-2} \log(n/d))$. In the low SNR regime where the SNR is below $\mathcal{O}((d/n)^{1/4})$, we show that EM converges to a $\mathcal{O}((d/n)^{1/4})$ neighborhood of the true parameters, after $\mathcal{O}((n/d)^{1/2})$ iterations. Notably, these results are achieved under mild conditions of either random initialization or an efficiently computable local initialization. By providing tight convergence guarantees of the EM algorithm in middle-to-low SNR regimes, we fill the remaining gap in the literature, and significantly, reveal that in low SNR, EM changes rate, matching the $n^{-1/4}$ rate of the MLE, a behavior that previous work had been unable to show.}
}
Endnote
%0 Conference Paper
%T On the Minimax Optimality of the EM Algorithm for Learning Two-Component Mixed Linear Regression
%A Jeongyeol Kwon
%A Nhat Ho
%A Constantine Caramanis
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu
%F pmlr-v130-kwon21b
%I PMLR
%P 1405--1413
%U https://proceedings.mlr.press/v130/kwon21b.html
%V 130
%X We study the convergence rates of the EM algorithm for learning two-component mixed linear regression under all regimes of signal-to-noise ratio (SNR). We resolve a long-standing question that many recent results have attempted to tackle: we completely characterize the convergence behavior of EM, and show that the EM algorithm achieves minimax optimal sample complexity under all SNR regimes. In particular, when the SNR is sufficiently large, the EM updates converge to the true parameter $\theta^{*}$ at the standard parametric convergence rate $\mathcal{O}((d/n)^{1/2})$ after $\mathcal{O}(\log(n/d))$ iterations. In the regime where the SNR is above $\mathcal{O}((d/n)^{1/4})$ and below some constant, the EM iterates converge to a $\mathcal{O}({\rm SNR}^{-1} (d/n)^{1/2})$ neighborhood of the true parameter, when the number of iterations is of the order $\mathcal{O}({\rm SNR}^{-2} \log(n/d))$. In the low SNR regime where the SNR is below $\mathcal{O}((d/n)^{1/4})$, we show that EM converges to a $\mathcal{O}((d/n)^{1/4})$ neighborhood of the true parameters, after $\mathcal{O}((n/d)^{1/2})$ iterations. Notably, these results are achieved under mild conditions of either random initialization or an efficiently computable local initialization. By providing tight convergence guarantees of the EM algorithm in middle-to-low SNR regimes, we fill the remaining gap in the literature, and significantly, reveal that in low SNR, EM changes rate, matching the $n^{-1/4}$ rate of the MLE, a behavior that previous work had been unable to show.
APA
Kwon, J., Ho, N. & Caramanis, C. (2021). On the Minimax Optimality of the EM Algorithm for Learning Two-Component Mixed Linear Regression. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:1405-1413. Available from https://proceedings.mlr.press/v130/kwon21b.html.