MedFuse: Multi-modal fusion with clinical time-series data and chest X-ray images

Nasir Hayat; Krzysztof J. Geras; Farah E. Shamout

Proceedings of the 7th Machine Learning for Healthcare Conference, PMLR 182:479-503, 2022.

Abstract

Multi-modal fusion approaches aim to integrate information from different data sources. Unlike natural datasets, such as in audio-visual applications, where samples consist of “paired” modalities, data in healthcare is often collected asynchronously. Hence, requiring the presence of all modalities for a given sample is not realistic for clinical tasks and significantly limits the size of the dataset during training. In this paper, we propose MedFuse, a conceptually simple yet promising LSTM-based fusion module that can accommodate uni-modal as well as multi-modal input. We evaluate the fusion method and introduce new benchmark results for in-hospital mortality prediction and phenotype classification, using clinical time-series data in the MIMIC-IV dataset and corresponding chest X-ray images in MIMIC-CXR. Compared to more complex multi-modal fusion strategies, MedFuse provides a performance improvement by a large margin on the fully paired test set. It also remains robust across the partially paired test set containing samples with missing chest X-ray images. We release our code for reproducibility and to enable the evaluation of competing models in the future.

Cite this Paper

BibTeX


@InProceedings{pmlr-v182-hayat22a,
  title = 	 {MedFuse: Multi-modal fusion with clinical time-series data and chest X-ray images},
  author =       {Hayat, Nasir and Geras, Krzysztof J. and Shamout, Farah E.},
  booktitle = 	 {Proceedings of the 7th Machine Learning for Healthcare Conference},
  pages = 	 {479--503},
  year = 	 {2022},
  editor = 	 {Lipton, Zachary and Ranganath, Rajesh and Sendak, Mark and Sjoding, Michael and Yeung, Serena},
  volume = 	 {182},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {05--06 Aug},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v182/hayat22a/hayat22a.pdf},
  url = 	 {https://proceedings.mlr.press/v182/hayat22a.html},
  abstract = 	 {Multi-modal fusion approaches aim to integrate information from different data sources. Unlike natural datasets, such as in audio-visual applications, where samples consist of “paired” modalities, data in healthcare is often collected asynchronously. Hence, requiring the presence of all modalities for a given sample is not realistic for clinical tasks and significantly limits the size of the dataset during training. In this paper, we propose MedFuse, a conceptually simple yet promising LSTM-based fusion module that can accommodate uni-modal as well as multi-modal input. We evaluate the fusion method and introduce new benchmark results for in-hospital mortality prediction and phenotype classification, using clinical time-series data in the MIMIC-IV dataset and corresponding chest X-ray images in MIMIC-CXR. Compared to more complex multi-modal fusion strategies, MedFuse provides a performance improvement by a large margin on the fully paired test set. It also remains robust across the partially paired test set containing samples with missing chest X-ray images. We release our code for reproducibility and to enable the evaluation of competing models in the future.}
}

Endnote

%0 Conference Paper
%T MedFuse: Multi-modal fusion with clinical time-series data and chest X-ray images
%A Nasir Hayat
%A Krzysztof J. Geras
%A Farah E. Shamout
%B Proceedings of the 7th Machine Learning for Healthcare Conference
%C Proceedings of Machine Learning Research
%D 2022
%E Zachary Lipton
%E Rajesh Ranganath
%E Mark Sendak
%E Michael Sjoding
%E Serena Yeung
%F pmlr-v182-hayat22a
%I PMLR
%P 479--503
%U https://proceedings.mlr.press/v182/hayat22a.html
%V 182
%X Multi-modal fusion approaches aim to integrate information from different data sources. Unlike natural datasets, such as in audio-visual applications, where samples consist of “paired” modalities, data in healthcare is often collected asynchronously. Hence, requiring the presence of all modalities for a given sample is not realistic for clinical tasks and significantly limits the size of the dataset during training. In this paper, we propose MedFuse, a conceptually simple yet promising LSTM-based fusion module that can accommodate uni-modal as well as multi-modal input. We evaluate the fusion method and introduce new benchmark results for in-hospital mortality prediction and phenotype classification, using clinical time-series data in the MIMIC-IV dataset and corresponding chest X-ray images in MIMIC-CXR. Compared to more complex multi-modal fusion strategies, MedFuse provides a performance improvement by a large margin on the fully paired test set. It also remains robust across the partially paired test set containing samples with missing chest X-ray images. We release our code for reproducibility and to enable the evaluation of competing models in the future.

APA


Hayat, N., Geras, K.J. & Shamout, F.E. (2022). MedFuse: Multi-modal fusion with clinical time-series data and chest X-ray images. Proceedings of the 7th Machine Learning for Healthcare Conference, in Proceedings of Machine Learning Research 182:479-503. Available from https://proceedings.mlr.press/v182/hayat22a.html.
