Bootstrapping EM via Power EM and Convergence in the Naive Bayes Model

Costis Daskalakis, Christos Tzamos, Manolis Zampetakis
Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, PMLR 84:2056-2064, 2018.

Abstract

We study the convergence properties of the Expectation-Maximization (EM) algorithm in the Naive Bayes model. We show that EM can get stuck in regions of slow convergence, even when the features are binary and i.i.d. conditioned on the class label, and even under random (i.e., non-worst-case) initialization. In turn, we show that EM can be bootstrapped via a pre-training step that computes a good initialization. From this initialization, we show theoretically and experimentally that EM converges exponentially fast to the true model parameters. Our bootstrapping method amounts to running the EM algorithm on appropriately centered iterates of small magnitude, which, as we show, effectively corresponds to performing power iteration on the covariance matrix of the mixture model, although the power iteration is carried out under the hood by EM itself. As such, we call our bootstrapping approach “power EM.” For the special case of two binary features, we show global, exponentially fast convergence of EM, even without bootstrapping. Finally, as the Naive Bayes model is quite expressive, we obtain as corollaries of our convergence results that the EM algorithm globally converges to the true model parameters for mixtures of two Gaussians, recovering recent results of Xu et al. (2016) and Daskalakis et al. (2017).
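
To make the bootstrapping idea concrete, here is a minimal sketch in Python (not from the paper) for the Gaussian corollary: a symmetric mixture 0.5 N(mu*, I) + 0.5 N(-mu*, I) with equal weights and known identity covariance. In this setting the EM update takes the familiar form mu <- mean_i tanh(<mu, x_i>) x_i, and for iterates of small norm tanh(<mu, x>) ≈ <mu, x>, so each EM step approximately multiplies the iterate by the sample covariance; rescaling the iterate back to a small radius after every step therefore performs power iteration "under the hood," as the abstract describes. The data dimensions and the rescaling radius eps below are illustrative assumptions, not values from the paper.

    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic data: symmetric mixture 0.5*N(+mu*, I) + 0.5*N(-mu*, I).
    d, n = 10, 20000
    mu_star = np.zeros(d)
    mu_star[0] = 1.5
    signs = rng.choice([-1.0, 1.0], size=n)
    X = signs[:, None] * mu_star + rng.standard_normal((n, d))

    def em_step(mu, X):
        # One EM update for 0.5*N(mu, I) + 0.5*N(-mu, I): the posterior
        # weight of the "+" component at x is sigmoid(2<mu, x>), so the
        # M-step gives mu <- mean_i tanh(<mu, x_i>) * x_i.
        w = np.tanh(X @ mu)
        return (w[:, None] * X).mean(axis=0)

    # Power EM bootstrap: keep the iterate small so tanh(<mu, x>) ~ <mu, x>,
    # making each EM step approximately a multiplication by the sample
    # covariance (power iteration); rescale after every step.
    eps = 1e-3                                # illustrative small radius
    mu = eps * rng.standard_normal(d)
    for _ in range(50):
        mu = em_step(mu, X)
        mu = eps * mu / np.linalg.norm(mu)    # rescale the iterate

    # Plain EM from the bootstrapped direction then recovers the magnitude.
    for _ in range(100):
        mu = em_step(mu, X)

    cos = mu @ mu_star / (np.linalg.norm(mu) * np.linalg.norm(mu_star))
    print("cosine with true mean:", cos)
    print("estimated vs true norm:", np.linalg.norm(mu), np.linalg.norm(mu_star))

After the bootstrap has recovered the direction of mu*, a modest number of plain EM iterations suffices to recover its magnitude as well, matching the exponentially fast convergence described above.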

Cite this Paper


BibTeX
@InProceedings{pmlr-v84-daskalakis18a,
  title     = {Bootstrapping EM via Power EM and Convergence in the Naive Bayes Model},
  author    = {Daskalakis, Costis and Tzamos, Christos and Zampetakis, Manolis},
  booktitle = {Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics},
  pages     = {2056--2064},
  year      = {2018},
  editor    = {Storkey, Amos and Perez-Cruz, Fernando},
  volume    = {84},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--11 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v84/daskalakis18a/daskalakis18a.pdf},
  url       = {https://proceedings.mlr.press/v84/daskalakis18a.html}
}
EndNote
%0 Conference Paper
%T Bootstrapping EM via Power EM and Convergence in the Naive Bayes Model
%A Costis Daskalakis
%A Christos Tzamos
%A Manolis Zampetakis
%B Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2018
%E Amos Storkey
%E Fernando Perez-Cruz
%F pmlr-v84-daskalakis18a
%I PMLR
%P 2056--2064
%U https://proceedings.mlr.press/v84/daskalakis18a.html
%V 84
APA
Daskalakis, C., Tzamos, C. & Zampetakis, M. (2018). Bootstrapping EM via Power EM and Convergence in the Naive Bayes Model. Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 84:2056-2064. Available from https://proceedings.mlr.press/v84/daskalakis18a.html.