How to validate Machine Learning Models Prior to Deployment: Silent trial protocol for evaluation of real-time models at ICU

Sana Tonekaboni, Gabriela Morgenshtern, Azadeh Assadi, Aslesha Pokhrel, Xi Huang, Anand Jayarajan, Robert Greer, Gennady Pekhimenko, Melissa McCradden, Fanny Chevalier, Mjaye Mazwi, Anna Goldenberg
Proceedings of the Conference on Health, Inference, and Learning, PMLR 174:169-182, 2022.

Abstract

Rigorous evaluation of ML models prior to deployment in hospital settings is critical to ensure utility, performance, and safety. In addition, a guarantee of the usability of such tools requires careful user-centred design and evaluation. Such evaluations can be extra challenging for models that measure unquantified and complex clinical phenomena like the risk of deterioration. This paper introduces a silent trial protocol for evaluating models in real-time in the ICU setting. The trial is designed following principles of formative testing with the goal of evaluating model performance and gathering information that can be used to refine the model to best fit within the intended environment of deployment. We highlight the considerations for a systematic evaluation and explain the design and deployment of the components that enable this trial. We hope that the principles and considerations introduced in this paper can help other researchers validate ML models in their clinical settings.

Cite this Paper

BibTeX
@InProceedings{pmlr-v174-tonekaboni22a,
  title     = {How to validate Machine Learning Models Prior to Deployment: Silent trial protocol for evaluation of real-time models at ICU},
  author    = {Tonekaboni, Sana and Morgenshtern, Gabriela and Assadi, Azadeh and Pokhrel, Aslesha and Huang, Xi and Jayarajan, Anand and Greer, Robert and Pekhimenko, Gennady and McCradden, Melissa and Chevalier, Fanny and Mazwi, Mjaye and Goldenberg, Anna},
  booktitle = {Proceedings of the Conference on Health, Inference, and Learning},
  pages     = {169--182},
  year      = {2022},
  editor    = {Flores, Gerardo and Chen, George H and Pollard, Tom and Ho, Joyce C and Naumann, Tristan},
  volume    = {174},
  series    = {Proceedings of Machine Learning Research},
  month     = {07--08 Apr},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v174/tonekaboni22a/tonekaboni22a.pdf},
  url       = {https://proceedings.mlr.press/v174/tonekaboni22a.html},
  abstract  = {Rigorous evaluation of ML models prior to deployment in hospital settings is critical to ensure utility, performance, and safety. In addition, a guarantee of the usability of such tools requires careful user-centred design and evaluation. Such evaluations can be extra challenging for models that measure unquantified and complex clinical phenomena like the risk of deterioration. This paper introduces a silent trial protocol for evaluating models in real-time in the ICU setting. The trial is designed following principles of formative testing with the goal of evaluating model performance and gathering information that can be used to refine the model to best fit within the intended environment of deployment. We highlight the considerations for a systematic evaluation and explain the design and deployment of the components that enable this trial. We hope that the principles and considerations introduced in this paper can help other researchers validate ML models in their clinical settings.}
}
Endnote
%0 Conference Paper
%T How to validate Machine Learning Models Prior to Deployment: Silent trial protocol for evaluation of real-time models at ICU
%A Sana Tonekaboni
%A Gabriela Morgenshtern
%A Azadeh Assadi
%A Aslesha Pokhrel
%A Xi Huang
%A Anand Jayarajan
%A Robert Greer
%A Gennady Pekhimenko
%A Melissa McCradden
%A Fanny Chevalier
%A Mjaye Mazwi
%A Anna Goldenberg
%B Proceedings of the Conference on Health, Inference, and Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Gerardo Flores
%E George H Chen
%E Tom Pollard
%E Joyce C Ho
%E Tristan Naumann
%F pmlr-v174-tonekaboni22a
%I PMLR
%P 169--182
%U https://proceedings.mlr.press/v174/tonekaboni22a.html
%V 174
%X Rigorous evaluation of ML models prior to deployment in hospital settings is critical to ensure utility, performance, and safety. In addition, a guarantee of the usability of such tools requires careful user-centred design and evaluation. Such evaluations can be extra challenging for models that measure unquantified and complex clinical phenomena like the risk of deterioration. This paper introduces a silent trial protocol for evaluating models in real-time in the ICU setting. The trial is designed following principles of formative testing with the goal of evaluating model performance and gathering information that can be used to refine the model to best fit within the intended environment of deployment. We highlight the considerations for a systematic evaluation and explain the design and deployment of the components that enable this trial. We hope that the principles and considerations introduced in this paper can help other researchers validate ML models in their clinical settings.
APA
Tonekaboni, S., Morgenshtern, G., Assadi, A., Pokhrel, A., Huang, X., Jayarajan, A., Greer, R., Pekhimenko, G., McCradden, M., Chevalier, F., Mazwi, M. & Goldenberg, A. (2022). How to validate Machine Learning Models Prior to Deployment: Silent trial protocol for evaluation of real-time models at ICU. Proceedings of the Conference on Health, Inference, and Learning, in Proceedings of Machine Learning Research 174:169-182. Available from https://proceedings.mlr.press/v174/tonekaboni22a.html.