Limits of Estimating Heterogeneous Treatment Effects: Guidelines for Practical Algorithm Design

Ahmed Alaa, Mihaela Schaar
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:129-138, 2018.

Abstract

Estimating heterogeneous treatment effects from observational data is a central problem in many domains. Because counterfactual data is inaccessible, the problem differs fundamentally from supervised learning, and entails a more complex set of modeling choices. Despite a variety of recently proposed algorithmic solutions, a principled guideline for building estimators of treatment effects using machine learning algorithms is still lacking. In this paper, we provide such a guideline by characterizing the fundamental limits of estimating heterogeneous treatment effects, and establishing conditions under which these limits can be achieved. Our analysis reveals that the relative importance of the different aspects of observational data varies with the sample size. For instance, we show that selection bias matters only in small-sample regimes, whereas with a large sample size, the way an algorithm models the control and treated outcomes is what bottlenecks its performance. Guided by our analysis, we build a practical algorithm for estimating treatment effects using non-stationary Gaussian processes with doubly-robust hyperparameters. Using a standard semi-synthetic simulation setup, we show that our algorithm outperforms the state-of-the-art, and that the behavior of existing algorithms conforms with our analysis.

Related Material