A Systematic Review of Causal Machine Learning Approaches in Road Crash Analysis
Proceedings of IndabaX Nigeria 2026: Building Scalable AI That Works: From Research to Deployment in Resource-Constrained Environments, PMLR 319:382-405, 2026.
Abstract
Although machine learning (ML) has significantly increased the predictive accuracy of road crash severity and frequency models, traditional predictive classifiers and the interpretability tools built on them often fail to distinguish correlation from causation. Such approaches lack the counterfactual reasoning that sound policy development requires. To examine the shift toward formal causal inference in traffic safety, this review followed the PRISMA 2020 guidelines and synthesised 35 peer-reviewed articles published between 2021 and 2025. The synthesis classifies the literature into a three-level taxonomy of Predictive ML, Interpretable ML, and Causal ML, showing that most existing research remains rooted in purely predictive ensembles or explainability tools such as SHAP. A small but well-developed body of work applies genuine causal ML methods, namely Doubly Robust Learning, Uplift Modelling, and Causal Graph Discovery, which are effective in estimating heterogeneous treatment effects (HTE) and mitigating confounding bias in observational crash data. Critical methodological gaps persist, including the continued conflation of predictive feature importance with causal effect, sensitivity to unobserved heterogeneity, and the absence of standardised causal benchmarks. Comprehensive sensitivity analyses and the integration of structural causal models are identified as prerequisites for maturing Intelligent Transportation Systems (ITS) toward proactive, evidence-based safety interventions.