Synthetic Potential Outcomes and Causal Mixture Identifiability

Bijan Mazaheri, Chandler Squires, Caroline Uhler
Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, PMLR 258:4276-4284, 2025.

Abstract

Heterogeneous data from multiple populations, sub-groups, or sources can be represented as a "mixture model" with a single latent class influencing all of the observed covariates. Heterogeneity can be resolved at different levels by grouping populations according to different notions of similarity. This paper proposes grouping with respect to the causal response of an intervention or perturbation on the system. This is distinct from previous notions, such as grouping by similar covariate values (e.g., clustering) or similar correlations between covariates (e.g., Gaussian mixture models). To solve the problem, we "synthetically sample" from a counterfactual distribution using higher-order multi-linear moments of the observable data. To understand how these “causal mixtures” fit in with more classical notions, we develop a hierarchy of mixture identifiability.

Cite this Paper


BibTeX
@InProceedings{pmlr-v258-mazaheri25a,
  title     = {Synthetic Potential Outcomes and Causal Mixture Identifiability},
  author    = {Mazaheri, Bijan and Squires, Chandler and Uhler, Caroline},
  booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  pages     = {4276--4284},
  year      = {2025},
  editor    = {Li, Yingzhen and Mandt, Stephan and Agrawal, Shipra and Khan, Emtiyaz},
  volume    = {258},
  series    = {Proceedings of Machine Learning Research},
  month     = {03--05 May},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v258/main/assets/mazaheri25a/mazaheri25a.pdf},
  url       = {https://proceedings.mlr.press/v258/mazaheri25a.html},
  abstract  = {Heterogeneous data from multiple populations, sub-groups, or sources can be represented as a "mixture model" with a single latent class influencing all of the observed covariates. Heterogeneity can be resolved at different levels by grouping populations according to different notions of similarity. This paper proposes grouping with respect to the causal response of an intervention or perturbation on the system. This is distinct from previous notions, such as grouping by similar covariate values (e.g., clustering) or similar correlations between covariates (e.g., Gaussian mixture models). To solve the problem, we "synthetically sample" from a counterfactual distribution using higher-order multi-linear moments of the observable data. To understand how these “causal mixtures” fit in with more classical notions, we develop a hierarchy of mixture identifiability.}
}
Endnote
%0 Conference Paper
%T Synthetic Potential Outcomes and Causal Mixture Identifiability
%A Bijan Mazaheri
%A Chandler Squires
%A Caroline Uhler
%B Proceedings of The 28th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2025
%E Yingzhen Li
%E Stephan Mandt
%E Shipra Agrawal
%E Emtiyaz Khan
%F pmlr-v258-mazaheri25a
%I PMLR
%P 4276--4284
%U https://proceedings.mlr.press/v258/mazaheri25a.html
%V 258
%X Heterogeneous data from multiple populations, sub-groups, or sources can be represented as a "mixture model" with a single latent class influencing all of the observed covariates. Heterogeneity can be resolved at different levels by grouping populations according to different notions of similarity. This paper proposes grouping with respect to the causal response of an intervention or perturbation on the system. This is distinct from previous notions, such as grouping by similar covariate values (e.g., clustering) or similar correlations between covariates (e.g., Gaussian mixture models). To solve the problem, we "synthetically sample" from a counterfactual distribution using higher-order multi-linear moments of the observable data. To understand how these “causal mixtures” fit in with more classical notions, we develop a hierarchy of mixture identifiability.
APA
Mazaheri, B., Squires, C. & Uhler, C. (2025). Synthetic Potential Outcomes and Causal Mixture Identifiability. Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 258:4276-4284. Available from https://proceedings.mlr.press/v258/mazaheri25a.html.
