Common Event Tethering to Improve Prediction of Rare Clinical Events

Quinn Lanners, Qin Weng, Marie-Louise Meng, Matthew M. Engelhard
Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, PMLR 244:2136-2162, 2024.

Abstract

Learning to predict rare medical events is difficult due to the inherent lack of signal in highly imbalanced datasets. Yet, oftentimes we also have access to surrogate or related outcomes that we believe share etiology or underlying risk factors with the event of interest. In this work, we propose the use of two variants of a well-known approach, regularized multi-label learning (MLL), that we hypothesize are uniquely suited to leverage this similarity and improve model performance in rare event settings. Whereas most analyses of MLL emphasize improved performance across all event types, our analyses quantify benefits to rare event prediction offered by our approach when a more common, related event is available to enhance learning. We begin by deriving asymptotic properties and providing theoretical insight into the convergence rates of our proposed estimators. We then provide simulation results highlighting how characteristics of the data generating process, including the event similarity and event rate, affect our proposed models’ performance. We conclude by showing real-world benefit of our approach in two clinical settings: prediction of rare cardiovascular morbidities in the setting of preeclampsia; and early prediction of autism from the electronic health record.

Cite this Paper


BibTeX
@InProceedings{pmlr-v244-lanners24a,
  title     = {Common Event Tethering to Improve Prediction of Rare Clinical Events},
  author    = {Lanners, Quinn and Weng, Qin and Meng, Marie-Louise and Engelhard, Matthew M.},
  booktitle = {Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence},
  pages     = {2136--2162},
  year      = {2024},
  editor    = {Kiyavash, Negar and Mooij, Joris M.},
  volume    = {244},
  series    = {Proceedings of Machine Learning Research},
  month     = {15--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v244/main/assets/lanners24a/lanners24a.pdf},
  url       = {https://proceedings.mlr.press/v244/lanners24a.html},
  abstract  = {Learning to predict rare medical events is difficult due to the inherent lack of signal in highly imbalanced datasets. Yet, oftentimes we also have access to surrogate or related outcomes that we believe share etiology or underlying risk factors with the event of interest. In this work, we propose the use of two variants of a well-known approach, regularized multi-label learning (MLL), that we hypothesize are uniquely suited to leverage this similarity and improve model performance in rare event settings. Whereas most analyses of MLL emphasize improved performance across all event types, our analyses quantify benefits to rare event prediction offered by our approach when a more common, related event is available to enhance learning. We begin by deriving asymptotic properties and providing theoretical insight into the convergence rates of our proposed estimators. We then provide simulation results highlighting how characteristics of the data generating process, including the event similarity and event rate, affect our proposed models’ performance. We conclude by showing real-world benefit of our approach in two clinical settings: prediction of rare cardiovascular morbidities in the setting of preeclampsia; and early prediction of autism from the electronic health record.}
}
Endnote
%0 Conference Paper
%T Common Event Tethering to Improve Prediction of Rare Clinical Events
%A Quinn Lanners
%A Qin Weng
%A Marie-Louise Meng
%A Matthew M. Engelhard
%B Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2024
%E Negar Kiyavash
%E Joris M. Mooij
%F pmlr-v244-lanners24a
%I PMLR
%P 2136--2162
%U https://proceedings.mlr.press/v244/lanners24a.html
%V 244
%X Learning to predict rare medical events is difficult due to the inherent lack of signal in highly imbalanced datasets. Yet, oftentimes we also have access to surrogate or related outcomes that we believe share etiology or underlying risk factors with the event of interest. In this work, we propose the use of two variants of a well-known approach, regularized multi-label learning (MLL), that we hypothesize are uniquely suited to leverage this similarity and improve model performance in rare event settings. Whereas most analyses of MLL emphasize improved performance across all event types, our analyses quantify benefits to rare event prediction offered by our approach when a more common, related event is available to enhance learning. We begin by deriving asymptotic properties and providing theoretical insight into the convergence rates of our proposed estimators. We then provide simulation results highlighting how characteristics of the data generating process, including the event similarity and event rate, affect our proposed models’ performance. We conclude by showing real-world benefit of our approach in two clinical settings: prediction of rare cardiovascular morbidities in the setting of preeclampsia; and early prediction of autism from the electronic health record.
APA
Lanners, Q., Weng, Q., Meng, M.-L., & Engelhard, M. M. (2024). Common Event Tethering to Improve Prediction of Rare Clinical Events. Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 244:2136-2162. Available from https://proceedings.mlr.press/v244/lanners24a.html.
