Alternating Minimization Converges Super-Linearly for Mixed Linear Regression

Avishek Ghosh, Ramchandran Kannan
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:1093-1103, 2020.

Abstract

We address the problem of solving mixed random linear equations. In this problem, we have unlabeled observations coming from multiple linear regressions, and each observation corresponds to exactly one of the regression models. The goal is to learn the linear regressors from the observations. Classically, Alternating Minimization (AM), which may be viewed as a variant of Expectation Maximization (EM), is used to solve this problem. AM iteratively alternates between estimating the labels and solving the regression problems with the estimated labels. Empirically, for a large variety of non-convex problems including mixed linear regression, AM is observed to converge at a much faster rate than gradient-based algorithms. However, the existing theory suggests a similar rate of convergence, failing to capture this empirical behavior. In this paper, we close this gap between theory and practice for the special case of a mixture of $2$ linear regressions. We show that, provided it is initialized properly, AM enjoys a \emph{super-linear} rate of convergence. To the best of our knowledge, this is the first work that theoretically establishes such a rate for AM. Consequently, to recover the unknown regressors up to an error (in $\ell_2$ norm) of $\epsilon$, AM takes only $\mathcal{O}(\log \log (1/\epsilon))$ iterations.
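To make the alternating structure concrete, here is a minimal Python sketch of AM for a mixture of two linear regressions on synthetic data. The function name, the random initialization, and the data-generation setup are illustrative assumptions, not the authors' implementation; in particular, the paper's super-linear guarantee requires a suitably good initialization (e.g., via a spectral method), which the random start below does not provide.

```python
import numpy as np

def alternating_minimization(X, y, n_iter=10, seed=None):
    """Sketch of AM for a mixture of 2 linear regressions.

    X : (n, d) design matrix, y : (n,) responses.
    Each iteration re-estimates the labels, then re-solves
    ordinary least squares on each estimated cluster.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Illustrative random initialization (the paper assumes a proper one).
    beta1, beta2 = rng.standard_normal(d), rng.standard_normal(d)
    for _ in range(n_iter):
        # Label step: assign each observation to the better-fitting regressor.
        r1 = (y - X @ beta1) ** 2
        r2 = (y - X @ beta2) ** 2
        mask = r1 <= r2
        # Regression step: least squares on each estimated cluster.
        if mask.any():
            beta1, *_ = np.linalg.lstsq(X[mask], y[mask], rcond=None)
        if (~mask).any():
            beta2, *_ = np.linalg.lstsq(X[~mask], y[~mask], rcond=None)
    return beta1, beta2

# Illustrative usage on synthetic mixed linear data.
rng = np.random.default_rng(0)
n, d = 2000, 10
X = rng.standard_normal((n, d))
b1_true, b2_true = rng.standard_normal(d), rng.standard_normal(d)
z = rng.integers(0, 2, size=n)  # hidden mixture labels
y = np.where(z == 0, X @ b1_true, X @ b2_true) + 0.01 * rng.standard_normal(n)
b1_hat, b2_hat = alternating_minimization(X, y, n_iter=10, seed=1)
```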

Cite this Paper


BibTeX
@InProceedings{pmlr-v108-ghosh20a,
  title     = {Alternating Minimization Converges Super-Linearly for Mixed Linear Regression},
  author    = {Ghosh, Avishek and Kannan, Ramchandran},
  booktitle = {Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics},
  pages     = {1093--1103},
  year      = {2020},
  editor    = {Chiappa, Silvia and Calandra, Roberto},
  volume    = {108},
  series    = {Proceedings of Machine Learning Research},
  month     = {26--28 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v108/ghosh20a/ghosh20a.pdf},
  url       = {https://proceedings.mlr.press/v108/ghosh20a.html},
  abstract  = {We address the problem of solving mixed random linear equations. In this problem, we have unlabeled observations coming from multiple linear regressions, and each observation corresponds to exactly one of the regression models. The goal is to learn the linear regressors from the observations. Classically, Alternating Minimization (AM) (which may be thought as a variant of Expectation Maximization (EM)) is used to solve this problem. AM iteratively alternates between the estimation of labels and solving the regression problems with the estimated labels. Empirically, it is observed that, for a large variety of non-convex problems including mixed linear regression, AM converges at a much faster rate compared to gradient based algorithms. However, the existing theory suggests similar rate of convergence, failing to capture this empirical behavior. In this paper, we close this gap between theory and practice for the special case of a mixture of $2$ linear regressions. We show that, provided initialized properly, AM enjoys a \emph{super-linear} rate of convergence. To the best of our knowledge, this is the first work that theoretically establishes such rate for AM. Hence, if we want to recover the unknown regressors upto an error (in $\ell_2$ norm) of $\epsilon$, AM only takes $\mathcal{O}(\log \log (1/\epsilon))$ iterations.}
}
Endnote
%0 Conference Paper
%T Alternating Minimization Converges Super-Linearly for Mixed Linear Regression
%A Avishek Ghosh
%A Ramchandran Kannan
%B Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2020
%E Silvia Chiappa
%E Roberto Calandra
%F pmlr-v108-ghosh20a
%I PMLR
%P 1093--1103
%U https://proceedings.mlr.press/v108/ghosh20a.html
%V 108
%X We address the problem of solving mixed random linear equations. In this problem, we have unlabeled observations coming from multiple linear regressions, and each observation corresponds to exactly one of the regression models. The goal is to learn the linear regressors from the observations. Classically, Alternating Minimization (AM) (which may be thought as a variant of Expectation Maximization (EM)) is used to solve this problem. AM iteratively alternates between the estimation of labels and solving the regression problems with the estimated labels. Empirically, it is observed that, for a large variety of non-convex problems including mixed linear regression, AM converges at a much faster rate compared to gradient based algorithms. However, the existing theory suggests similar rate of convergence, failing to capture this empirical behavior. In this paper, we close this gap between theory and practice for the special case of a mixture of $2$ linear regressions. We show that, provided initialized properly, AM enjoys a \emph{super-linear} rate of convergence. To the best of our knowledge, this is the first work that theoretically establishes such rate for AM. Hence, if we want to recover the unknown regressors upto an error (in $\ell_2$ norm) of $\epsilon$, AM only takes $\mathcal{O}(\log \log (1/\epsilon))$ iterations.
APA
Ghosh, A. & Kannan, R. (2020). Alternating Minimization Converges Super-Linearly for Mixed Linear Regression. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 108:1093-1103. Available from https://proceedings.mlr.press/v108/ghosh20a.html.
