Hyperparameter Tuning for Causal Inference with Double Machine Learning: A Simulation Study
Proceedings of the Third Conference on Causal Learning and Reasoning, PMLR 236:1065-1117, 2024.
Abstract
Proper hyperparameter tuning is essential for achieving optimal performance of modern machine learning (ML) methods in predictive tasks. While there is an extensive literature on tuning ML learners for prediction, there is only little guidance available on tuning ML learners for causal machine learning. In this paper, we investigate the role of hyperparameter tuning and other practical decisions for causal estimation based on the Double Machine Learning approach by Chernozhukov et al. (2018). Double Machine Learning relies on estimating so-called nuisance parameters by treating them as supervised learning problems and using them as plug-in estimates to solve for the (causal) parameter. We conduct an extensive simulation study using data from the 2019 Atlantic Causal Inference Conference Data Challenge. We provide empirical insights into how the selection and calibration of ML methods affect the performance of causal estimation. First, we assess the importance of data splitting schemes for tuning ML learners within Double Machine Learning. Second, we investigate the choice of ML methods and hyperparameters, including recent AutoML frameworks, and consider the relationship between their predictive power and the estimation performance for a causal parameter of interest. Third, we investigate to what extent the choice of a particular causal model, as characterized by incorporated parametric assumptions, can be based on predictive performance metrics.
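To illustrate the plug-in idea described in the abstract, the following is a minimal sketch of cross-fitted Double Machine Learning for a partially linear model, written with scikit-learn. The simulated data-generating process, the random forest learners, and their hyperparameters are assumptions for illustration only, not the settings studied in the paper.

```python
# Minimal sketch of cross-fitted DML for a partially linear model
#   Y = theta * D + g(X) + eps,   D = m(X) + v.
# The DGP and the random forest learners below are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n, p, theta = 2000, 10, 0.5
X = rng.normal(size=(n, p))
d = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(size=n)      # treatment
y = theta * d + np.cos(X[:, 0]) + X[:, 2] ** 2 + rng.normal(size=n)  # outcome

# Cross-fitting: nuisance functions are estimated on the training folds and
# predicted on the held-out fold to avoid overfitting bias.
res_y, res_d = np.zeros(n), np.zeros(n)
for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    ml_l = RandomForestRegressor(n_estimators=200, random_state=0)  # E[Y|X]
    ml_m = RandomForestRegressor(n_estimators=200, random_state=0)  # E[D|X]
    ml_l.fit(X[train], y[train])
    ml_m.fit(X[train], d[train])
    res_y[test] = y[test] - ml_l.predict(X[test])
    res_d[test] = d[test] - ml_m.predict(X[test])

# Neyman-orthogonal (partialling-out) score: regress outcome residuals on
# treatment residuals to obtain the causal parameter estimate.
theta_hat = np.sum(res_d * res_y) / np.sum(res_d ** 2)
print(f"estimated theta: {theta_hat:.3f} (true value {theta})")
```

In this sketch, the hyperparameters of the nuisance learners (here, the forests' settings) are exactly the tuning choices whose effect on causal estimation quality the paper studies.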