How to Validate Machine Learning Models Prior to Deployment: Silent Trial Protocol for Evaluation of Real-Time Models in the ICU
Proceedings of the Conference on Health, Inference, and Learning, PMLR 174:169-182, 2022.
Abstract
Rigorous evaluation of ML models prior to deployment in hospital settings is critical to ensure utility, performance, and safety. Guaranteeing the usability of such tools additionally requires careful user-centred design and evaluation. These evaluations can be especially challenging for models that measure unquantified and complex clinical phenomena, such as the risk of deterioration. This paper introduces a silent trial protocol for evaluating models in real time in the ICU setting. The trial is designed following principles of formative testing, with the goal of evaluating model performance and gathering information that can be used to refine the model to best fit within the intended environment of deployment. We highlight the considerations for a systematic evaluation and explain the design and deployment of the components that enable this trial. We hope that the principles and considerations introduced in this paper can help other researchers validate ML models in their own clinical settings.