[edit]

# Weak Separation in Mixture Models and Implications for Principal Stratification

*Proceedings of The 25th International Conference on Artificial Intelligence and Statistics*, PMLR 151:5416-5458, 2022.

#### Abstract

Principal stratification is a popular framework for addressing post-randomization complications, often in conjunction with finite mixture models for estimating the causal effects of interest. Unfortunately, standard estimators of mixture parameters, like the MLE, are known to exhibit pathological behavior. We study this behavior in a simple but fundamental example, a two-component Gaussian mixture model in which only the component means and variances are unknown, and focus on the setting in which the components are weakly separated. In this case, we show that the asymptotic convergence rate of the MLE is quite poor, such as $O(n^{-1/6})$ or even $O(n^{-1/8})$. We then demonstrate via both theoretical arguments and extensive simulations that the MLE behaves like a threshold estimator in finite samples, in the sense that the MLE can give strong evidence that the means are equal when the truth is otherwise. We also explore the behavior of the MLE when the MLE is non-zero, showing that it is difficult to estimate both the sign and magnitude of the means in this case. We provide diagnostics for all of these pathologies and apply these ideas to re-analyzing two randomized evaluations of job training programs, JOBS II and Job Corps. Our results suggest that the corresponding maximum likelihood estimates should be interpreted with caution in these cases.