[edit]
Generalizing Causal Effects from Randomized Controlled Trials to Target Populations across Diverse Environments
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:36170-36191, 2025.
Abstract
Generalizing causal effects from Randomized Controlled Trials (RCTs) to target populations across diverse environments is of significant practical importance, as RCTs are often costly and logistically complex to conduct. A key challenge is environmental shift, defined as changes in the distribution and availability of covariates between source and target environments. A common approach addressing this challenge is to identify a separating set–covariates that govern both treatment effect heterogeneity and environmental differences–and combine RCT samples with target populations matched on this set. However, this approach assumes that the separating set is fully observed and shared across datasets, an assumption often violated in practice. We propose a novel Two-Stage Doubly Robust (2SDR) method that relaxes this assumption by allowing the separating set to be observed in only one of the two datasets. 2SDR leverages shadow variables to impute missing components of the separating set and generalize treatment effects across environments in a two-stage procedure. We show the identification of causal effects in target environments under 2SDR and demonstrate its effectiveness through extensive experiments on both synthetic and real-world datasets.