Generalizing Causal Effects from Randomized Controlled Trials to Target Populations across Diverse Environments

Baohong Li, Yingrong Wang, Anpeng Wu, Ming Ma, Ruoxuan Xiong, Kun Kuang
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:36170-36191, 2025.

Abstract

Generalizing causal effects from Randomized Controlled Trials (RCTs) to target populations across diverse environments is of significant practical importance, as RCTs are often costly and logistically complex to conduct. A key challenge is environmental shift, defined as changes in the distribution and availability of covariates between source and target environments. A common approach to this challenge is to identify a separating set (covariates that govern both treatment effect heterogeneity and environmental differences) and combine RCT samples with target populations matched on this set. However, this approach assumes that the separating set is fully observed and shared across datasets, an assumption often violated in practice. We propose a novel Two-Stage Doubly Robust (2SDR) method that relaxes this assumption by allowing the separating set to be observed in only one of the two datasets. 2SDR leverages shadow variables to impute missing components of the separating set and generalize treatment effects across environments in a two-stage procedure. We show the identification of causal effects in target environments under 2SDR and demonstrate its effectiveness through extensive experiments on both synthetic and real-world datasets.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-li25cr,
  title = {Generalizing Causal Effects from Randomized Controlled Trials to Target Populations across Diverse Environments},
  author = {Li, Baohong and Wang, Yingrong and Wu, Anpeng and Ma, Ming and Xiong, Ruoxuan and Kuang, Kun},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {36170--36191},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/li25cr/li25cr.pdf},
  url = {https://proceedings.mlr.press/v267/li25cr.html},
  abstract = {Generalizing causal effects from Randomized Controlled Trials (RCTs) to target populations across diverse environments is of significant practical importance, as RCTs are often costly and logistically complex to conduct. A key challenge is environmental shift, defined as changes in the distribution and availability of covariates between source and target environments. A common approach addressing this challenge is to identify a separating set–covariates that govern both treatment effect heterogeneity and environmental differences–and combine RCT samples with target populations matched on this set. However, this approach assumes that the separating set is fully observed and shared across datasets, an assumption often violated in practice. We propose a novel Two-Stage Doubly Robust (2SDR) method that relaxes this assumption by allowing the separating set to be observed in only one of the two datasets. 2SDR leverages shadow variables to impute missing components of the separating set and generalize treatment effects across environments in a two-stage procedure. We show the identification of causal effects in target environments under 2SDR and demonstrate its effectiveness through extensive experiments on both synthetic and real-world datasets.}
}
Endnote
%0 Conference Paper
%T Generalizing Causal Effects from Randomized Controlled Trials to Target Populations across Diverse Environments
%A Baohong Li
%A Yingrong Wang
%A Anpeng Wu
%A Ming Ma
%A Ruoxuan Xiong
%A Kun Kuang
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-li25cr
%I PMLR
%P 36170--36191
%U https://proceedings.mlr.press/v267/li25cr.html
%V 267
%X Generalizing causal effects from Randomized Controlled Trials (RCTs) to target populations across diverse environments is of significant practical importance, as RCTs are often costly and logistically complex to conduct. A key challenge is environmental shift, defined as changes in the distribution and availability of covariates between source and target environments. A common approach addressing this challenge is to identify a separating set–covariates that govern both treatment effect heterogeneity and environmental differences–and combine RCT samples with target populations matched on this set. However, this approach assumes that the separating set is fully observed and shared across datasets, an assumption often violated in practice. We propose a novel Two-Stage Doubly Robust (2SDR) method that relaxes this assumption by allowing the separating set to be observed in only one of the two datasets. 2SDR leverages shadow variables to impute missing components of the separating set and generalize treatment effects across environments in a two-stage procedure. We show the identification of causal effects in target environments under 2SDR and demonstrate its effectiveness through extensive experiments on both synthetic and real-world datasets.
APA
Li, B., Wang, Y., Wu, A., Ma, M., Xiong, R., & Kuang, K. (2025). Generalizing Causal Effects from Randomized Controlled Trials to Target Populations across Diverse Environments. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:36170-36191. Available from https://proceedings.mlr.press/v267/li25cr.html.

Related Material