Density-Aware Personalized Training for Risk Prediction in Imbalanced Medical Data

Zepeng Huo, Xiaoning Qian, Shuai Huang, Zhangyang Wang, Bobak J. Mortazavi
Proceedings of the 7th Machine Learning for Healthcare Conference, PMLR 182:101-122, 2022.

Abstract

Medical events of interest, such as mortality, often occur at a low rate in electronic medical records, as most admitted patients survive. Training models under this imbalance (class density discrepancy) may lead to suboptimal prediction. Traditionally, this problem is addressed through ad hoc methods such as resampling or reweighting, but performance in many cases remains limited. We propose a framework for training models under this imbalance: 1) we first decouple feature extraction from classification, adjusting training batches separately for each component to mitigate the bias caused by class density discrepancy; 2) we train the network with both a density-aware loss and a learnable cost matrix for misclassifications. On real-world medical datasets (TOPCAT and MIMIC-III), our model improves AUC-ROC, AUC-PRC, and Brier Skill Score compared with baselines in the domain.
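
Of the three reported metrics, the Brier Skill Score is the least commonly available in standard toolkits, so a minimal sketch may help. It assumes the usual definition, BSS = 1 - BS / BS_ref, where the reference forecast predicts the observed event rate for every patient; the abstract does not state which reference the authors used, and the function names here are illustrative only.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(y_true) == 0 or len(y_true) != len(y_prob):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def brier_skill_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Brier score relative to a base-rate forecast (assumed reference).

    1.0 is a perfect forecast, 0.0 matches the base-rate forecast,
    and negative values are worse than the base-rate forecast.
    """
    base_rate = sum(y_true) / len(y_true)
    reference = brier_score(y_true, [base_rate] * len(y_true))
    if reference == 0.0:
        raise ValueError("undefined when only one class is present")
    return 1.0 - brier_score(y_true, y_prob) / reference


if __name__ == "__main__":
    # Toy imbalanced cohort: 1 event among 10 patients (made-up data).
    outcomes = [0] * 9 + [1]
    risks = [0.05] * 9 + [0.6]
    print(brier_skill_score(outcomes, risks))
```

Because the reference forecast already accounts for the event rate, this score is less flattering to a model that simply predicts the majority class than raw accuracy would be, which is presumably why it is reported alongside AUC-ROC and AUC-PRC for imbalanced outcomes.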

Cite this Paper


BibTeX
@InProceedings{pmlr-v182-huo22a,
  title = {Density-Aware Personalized Training for Risk Prediction in Imbalanced Medical Data},
  author = {Huo, Zepeng and Qian, Xiaoning and Huang, Shuai and Wang, Zhangyang and Mortazavi, Bobak J.},
  booktitle = {Proceedings of the 7th Machine Learning for Healthcare Conference},
  pages = {101--122},
  year = {2022},
  editor = {Lipton, Zachary and Ranganath, Rajesh and Sendak, Mark and Sjoding, Michael and Yeung, Serena},
  volume = {182},
  series = {Proceedings of Machine Learning Research},
  month = {05--06 Aug},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v182/huo22a/huo22a.pdf},
  url = {https://proceedings.mlr.press/v182/huo22a.html},
  abstract = {Medical events of interest, such as mortality, often happen at a low rate in electronic medical records, as most admitted patients survive. Training models with this imbalance rate (class density discrepancy) may lead to suboptimal prediction. Traditionally this problem is addressed through ad-hoc methods such as resampling or reweighting but performance in many cases is still limited. We propose a framework for training models for this imbalance issue: 1) we first decouple the feature extraction and classification process, adjusting training batches separately for each component to mitigate bias caused by class density discrepancy; 2) we train the network with both a density-aware loss and a learnable cost matrix for misclassifications. We demonstrate our model’s improved performance in real-world medical datasets (TOPCAT and MIMIC-III) to show improved AUC-ROC, AUC-PRC, Brier Skill Score compared with the baselines in the domain.}
}
Endnote
%0 Conference Paper
%T Density-Aware Personalized Training for Risk Prediction in Imbalanced Medical Data
%A Zepeng Huo
%A Xiaoning Qian
%A Shuai Huang
%A Zhangyang Wang
%A Bobak J. Mortazavi
%B Proceedings of the 7th Machine Learning for Healthcare Conference
%C Proceedings of Machine Learning Research
%D 2022
%E Zachary Lipton
%E Rajesh Ranganath
%E Mark Sendak
%E Michael Sjoding
%E Serena Yeung
%F pmlr-v182-huo22a
%I PMLR
%P 101--122
%U https://proceedings.mlr.press/v182/huo22a.html
%V 182
%X Medical events of interest, such as mortality, often happen at a low rate in electronic medical records, as most admitted patients survive. Training models with this imbalance rate (class density discrepancy) may lead to suboptimal prediction. Traditionally this problem is addressed through ad-hoc methods such as resampling or reweighting but performance in many cases is still limited. We propose a framework for training models for this imbalance issue: 1) we first decouple the feature extraction and classification process, adjusting training batches separately for each component to mitigate bias caused by class density discrepancy; 2) we train the network with both a density-aware loss and a learnable cost matrix for misclassifications. We demonstrate our model’s improved performance in real-world medical datasets (TOPCAT and MIMIC-III) to show improved AUC-ROC, AUC-PRC, Brier Skill Score compared with the baselines in the domain.
APA
Huo, Z., Qian, X., Huang, S., Wang, Z., & Mortazavi, B. J. (2022). Density-Aware Personalized Training for Risk Prediction in Imbalanced Medical Data. Proceedings of the 7th Machine Learning for Healthcare Conference, in Proceedings of Machine Learning Research 182:101-122. Available from https://proceedings.mlr.press/v182/huo22a.html.