Learning Causal Markov Boundaries with Mixed Observational and Experimental Data
Proceedings of The 12th International Conference on Probabilistic Graphical Models, PMLR 246:312-326, 2024.
Abstract
A frequent goal in healthcare is to estimate personalized causal effects from observational data, experimental (RCT) data, or both, in order to select the best treatment for a patient, where "best" means maximizing the expected value of the desired outcome. The first task in estimating personalized effects is selecting the optimal set of personalization covariates (causal feature selection). This set of covariates is the Markov Boundary of the outcome in the experimental distribution, also known as the Interventional Markov Boundary (IMB), and can be identified from RCT data using methods for finding Markov Boundaries. However, most RCT datasets have very small sample sizes, and these methods perform poorly on them. In this work, we develop methods that combine limited experimental data with large observational data to identify the IMB and improve the estimation of conditional (personalized) causal effects. These methods extend recent results (Triantafillou et al., 2021), which were limited to discrete data, to mixed data with binary and ordinal outcomes. The methods are based on Bayesian regression models. In simulated data, we show that our methods identify the correct IMB and improve causal effect estimation.