Unified Auto Clinical Scoring (Uni-ACS) with Interpretable ML models
Proceedings of the 7th Machine Learning for Healthcare Conference, PMLR 182:26-53, 2022.
Despite significant progress in explainable Machine Learning (ML) tools (such as LIME, SHAP and explainable boosting machines) in explaining ML models’ risk predictions in clinical problems (such as heart failure, acute kidney injury, sepsis and hypoxaemia during surgery), the interpretations generated remain to be an unfamiliar language to most clinicians. Clinical scores continue to be the preferred tool for risk stratification as they are concise, clinically correlatable and can be used at patient’s bedside without a machine. In this work, we reproduce the classical clinical scoring development approach to uncover its limitations in determining categorical features and using logistic regression coefficients to derive additive integer scoring systems. Subsequently, we propose the Unified Automatic Clinical Scoring (Uni-ACS) development framework, which overcomes these limitations to translating ML models into clinical scores by leveraging on explainable outputs from SHAP compatible ML models. We hypothesize that this approach is model agnostic, can be automated and can retain the complex predictive power of the underlying ML model, while relating key model insights to clinicians in a clinical risk scoring format. In our experiments, we applied Uni-ACS to a variety of ML models trained on the MIMIC III and MIMIC IV sepsis cohorts to predict mortality and ICU admission. We showed that Uni-ACS derived clinical score retained a greater proportion of the underlying ML models’ predictive performance (lowest AUROC drop of 2.44%), compared against the baseline clinical score (lowest AUROC drop of 5.79%). We further verified Uni-ACS clinical score’s insights against the current literature to show its clinical applicability. Uni-ACS and datasets used for method validation are open-sourced for the community to use and verify.