Optimizing Multi-Scale Representations to Detect Effect Heterogeneity Using Earth Observation and Computer Vision: Applications to Two Anti-Poverty RCTs

Fucheng Warren Zhu, Connor Thomas Jerzak, Adel Daoud
Proceedings of the Fourth Conference on Causal Learning and Reasoning, PMLR 275:894-919, 2025.

Abstract

Earth Observation (EO) data are increasingly used in policy analysis by enabling granular estimation of conditional average treatment effects (CATE). However, a challenge in EO-based causal inference is determining the scale of the input satellite imagery—balancing the trade-off between capturing fine-grained individual heterogeneity in smaller images and broader contextual information in larger ones. This paper introduces Multi-Scale Representation Concatenation, a set of composable procedures that transform arbitrary single-scale EO-based CATE estimation algorithms into multi-scale ones. We benchmark the performance of Multi-Scale Representation Concatenation on a CATE estimation pipeline that combines Vision Transformer (ViT) models (which encode images) with Causal Forests (CFs) to obtain CATE estimates from those encodings. We first perform simulation studies where the causal mechanism is known, showing that our multi-scale approach captures information relevant to effect heterogeneity that single-scale ViT models fail to capture as measured by R-squared. We then apply the multi-scale method to two randomized controlled trials (RCTs) conducted in Peru and Uganda using Landsat satellite imagery. As we do not have access to ground truth CATEs in the RCT analysis, the Rank Average Treatment Effect Ratio (RATE Ratio) measure is employed to assess performance. Results indicate that Multi-Scale Representation Concatenation improves the performance of deep learning models in EO-based CATE estimation without the complexity of designing new multi-scale architectures for a specific use case. The application of Multi-Scale Representation Concatenation could have meaningful policy benefits—e.g., potentially increasing the impact of poverty alleviation programs without additional resource expenditure.
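The core procedure the abstract describes, concatenating per-scale image representations before CATE estimation, can be sketched as follows. This is a minimal illustration, not the authors' code: the embeddings are random stand-ins for ViT encodings of the same locations at two image scales, and a scikit-learn random-forest T-learner is used as a simple stand-in for the Causal Forest estimator named in the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 500

# Stand-ins for ViT embeddings of the same locations at two image scales
# (a small patch capturing local detail, a larger patch capturing context).
emb_small = rng.normal(size=(n, 8))
emb_large = rng.normal(size=(n, 8))

# Multi-Scale Representation Concatenation: stack the per-scale embeddings
# into one feature matrix for the downstream CATE estimator.
X = np.concatenate([emb_small, emb_large], axis=1)

# Simulated RCT: random assignment, with effect heterogeneity that
# depends on information from both scales (known only in simulation).
T = rng.integers(0, 2, size=n)
tau = emb_small[:, 0] + emb_large[:, 0]  # true CATE
Y = emb_small[:, 1] + tau * T + rng.normal(scale=0.1, size=n)

# T-learner stand-in for the Causal Forest: fit outcome models per arm,
# estimate the CATE as the difference in predictions.
m1 = RandomForestRegressor(random_state=0).fit(X[T == 1], Y[T == 1])
m0 = RandomForestRegressor(random_state=0).fit(X[T == 0], Y[T == 0])
cate_hat = m1.predict(X) - m0.predict(X)
print(X.shape)  # (500, 16)
```

Because the concatenation step is agnostic to the downstream estimator, the same feature matrix `X` could be passed to any single-scale CATE pipeline, which is what makes the procedure composable.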

Cite this Paper


BibTeX
@InProceedings{pmlr-v275-zhu25a,
  title     = {Optimizing Multi-Scale Representations to Detect Effect Heterogeneity Using Earth Observation and Computer Vision: Applications to Two Anti-Poverty RCTs},
  author    = {Zhu, Fucheng Warren and Jerzak, Connor Thomas and Daoud, Adel},
  booktitle = {Proceedings of the Fourth Conference on Causal Learning and Reasoning},
  pages     = {894--919},
  year      = {2025},
  editor    = {Huang, Biwei and Drton, Mathias},
  volume    = {275},
  series    = {Proceedings of Machine Learning Research},
  month     = {07--09 May},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v275/main/assets/zhu25a/zhu25a.pdf},
  url       = {https://proceedings.mlr.press/v275/zhu25a.html},
  abstract  = {Earth Observation (EO) data are increasingly used in policy analysis by enabling granular estimation of conditional average treatment effects (CATE). However, a challenge in EO-based causal inference is determining the scale of the input satellite imagery—balancing the trade-off between capturing fine-grained individual heterogeneity in smaller images and broader contextual information in larger ones. This paper introduces Multi-Scale Representation Concatenation, a set of composable procedures that transform arbitrary single-scale EO-based CATE estimation algorithms into multi-scale ones. We benchmark the performance of Multi-Scale Representation Concatenation on a CATE estimation pipeline that combines Vision Transformer (ViT) models (which encode images) with Causal Forests (CFs) to obtain CATE estimates from those encodings. We first perform simulation studies where the causal mechanism is known, showing that our multi-scale approach captures information relevant to effect heterogeneity that single-scale ViT models fail to capture as measured by R-squared. We then apply the multi-scale method to two randomized controlled trials (RCTs) conducted in Peru and Uganda using Landsat satellite imagery. As we do not have access to ground truth CATEs in the RCT analysis, the Rank Average Treatment Effect Ratio (RATE Ratio) measure is employed to assess performance. Results indicate that Multi-Scale Representation Concatenation improves the performance of deep learning models in EO-based CATE estimation without the complexity of designing new multi-scale architectures for a specific use case. The application of Multi-Scale Representation Concatenation could have meaningful policy benefits—e.g., potentially increasing the impact of poverty alleviation programs without additional resource expenditure.}
}
Endnote
%0 Conference Paper
%T Optimizing Multi-Scale Representations to Detect Effect Heterogeneity Using Earth Observation and Computer Vision: Applications to Two Anti-Poverty RCTs
%A Fucheng Warren Zhu
%A Connor Thomas Jerzak
%A Adel Daoud
%B Proceedings of the Fourth Conference on Causal Learning and Reasoning
%C Proceedings of Machine Learning Research
%D 2025
%E Biwei Huang
%E Mathias Drton
%F pmlr-v275-zhu25a
%I PMLR
%P 894--919
%U https://proceedings.mlr.press/v275/zhu25a.html
%V 275
%X Earth Observation (EO) data are increasingly used in policy analysis by enabling granular estimation of conditional average treatment effects (CATE). However, a challenge in EO-based causal inference is determining the scale of the input satellite imagery—balancing the trade-off between capturing fine-grained individual heterogeneity in smaller images and broader contextual information in larger ones. This paper introduces Multi-Scale Representation Concatenation, a set of composable procedures that transform arbitrary single-scale EO-based CATE estimation algorithms into multi-scale ones. We benchmark the performance of Multi-Scale Representation Concatenation on a CATE estimation pipeline that combines Vision Transformer (ViT) models (which encode images) with Causal Forests (CFs) to obtain CATE estimates from those encodings. We first perform simulation studies where the causal mechanism is known, showing that our multi-scale approach captures information relevant to effect heterogeneity that single-scale ViT models fail to capture as measured by R-squared. We then apply the multi-scale method to two randomized controlled trials (RCTs) conducted in Peru and Uganda using Landsat satellite imagery. As we do not have access to ground truth CATEs in the RCT analysis, the Rank Average Treatment Effect Ratio (RATE Ratio) measure is employed to assess performance. Results indicate that Multi-Scale Representation Concatenation improves the performance of deep learning models in EO-based CATE estimation without the complexity of designing new multi-scale architectures for a specific use case. The application of Multi-Scale Representation Concatenation could have meaningful policy benefits—e.g., potentially increasing the impact of poverty alleviation programs without additional resource expenditure.
APA
Zhu, F.W., Jerzak, C.T. & Daoud, A. (2025). Optimizing Multi-Scale Representations to Detect Effect Heterogeneity Using Earth Observation and Computer Vision: Applications to Two Anti-Poverty RCTs. Proceedings of the Fourth Conference on Causal Learning and Reasoning, in Proceedings of Machine Learning Research 275:894-919. Available from https://proceedings.mlr.press/v275/zhu25a.html.