MedMod: Multimodal Benchmark for Medical Prediction Tasks with Electronic Health Records and Chest X-Ray Scans

Shaza Elsharief, Saeed Shurrab, Baraa Al Jorf, Leopoldo Julian Lechuga Lopez, Krzysztof J. Geras, Farah E. Shamout
Proceedings of the sixth Conference on Health, Inference, and Learning, PMLR 287:781-803, 2025.

Abstract

Multimodal machine learning provides a myriad of opportunities for developing models that integrate multiple modalities and mimic decision-making in the real world, such as in medical settings. However, benchmarks involving multimodal medical data are scarce, especially for routinely collected modalities such as Electronic Health Records (EHR) and Chest X-ray images (CXR). To contribute towards advancing multimodal learning for real-world prediction tasks, we present MedMod, a multimodal medical benchmark with EHR and CXR built from the publicly available MIMIC-IV and MIMIC-CXR datasets, respectively. MedMod comprises five clinical prediction tasks: clinical conditions, in-hospital mortality, decompensation, length of stay, and radiological findings. We extensively evaluate several multimodal supervised learning models and self-supervised learning frameworks, making all of our code and models open-source.
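As a brief illustration of the kind of multimodal supervised baseline such a benchmark evaluates, the sketch below fuses an LSTM encoder over EHR time-series with a small CNN encoder over CXR images via late fusion. It is a minimal PyTorch example under illustrative assumptions (the LateFusionClassifier name, 76 EHR features per time step, and 25 target labels are hypothetical choices), not the MedMod reference implementation.

# Hypothetical late-fusion baseline for EHR time-series + CXR images.
# Architecture, names, and dimensions are illustrative assumptions only.
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    def __init__(self, ehr_features=76, hidden=128, num_classes=25):
        super().__init__()
        # EHR branch: LSTM over hourly clinical time-series.
        self.ehr_encoder = nn.LSTM(ehr_features, hidden, batch_first=True)
        # CXR branch: small CNN over single-channel chest X-rays.
        self.cxr_encoder = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, hidden),
        )
        # Fusion head: concatenate modality embeddings, then classify.
        self.head = nn.Linear(2 * hidden, num_classes)

    def forward(self, ehr, cxr):
        _, (h, _) = self.ehr_encoder(ehr)    # h: (num_layers, B, hidden)
        ehr_emb = h[-1]                      # (B, hidden)
        cxr_emb = self.cxr_encoder(cxr)      # (B, hidden)
        return self.head(torch.cat([ehr_emb, cxr_emb], dim=-1))

# Example: batch of 4 stays, 48 hourly steps, 224x224 X-rays.
model = LateFusionClassifier()
logits = model(torch.randn(4, 48, 76), torch.randn(4, 1, 224, 224))
print(logits.shape)  # torch.Size([4, 25])

The same two encoders could also serve as modality-specific backbones for the self-supervised frameworks mentioned in the abstract, with the classification head replaced by a contrastive or reconstruction objective.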

Cite this Paper


BibTeX
@InProceedings{pmlr-v287-elsharief25a,
  title     = {MedMod: Multimodal Benchmark for Medical Prediction Tasks with Electronic Health Records and Chest X-Ray Scans},
  author    = {Elsharief, Shaza and Shurrab, Saeed and Jorf, Baraa Al and Lopez, Leopoldo Julian Lechuga and Geras, Krzysztof J. and Shamout, Farah E.},
  booktitle = {Proceedings of the sixth Conference on Health, Inference, and Learning},
  pages     = {781--803},
  year      = {2025},
  editor    = {Xu, Xuhai Orson and Choi, Edward and Singhal, Pankhuri and Gerych, Walter and Tang, Shengpu and Agrawal, Monica and Subbaswamy, Adarsh and Sizikova, Elena and Dunn, Jessilyn and Daneshjou, Roxana and Sarker, Tasmie and McDermott, Matthew and Chen, Irene},
  volume    = {287},
  series    = {Proceedings of Machine Learning Research},
  month     = {25--27 Jun},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v287/main/assets/elsharief25a/elsharief25a.pdf},
  url       = {https://proceedings.mlr.press/v287/elsharief25a.html},
  abstract  = {Multimodal machine learning provides a myriad of opportunities for developing models that integrate multiple modalities and mimic decision-making in the real-world, such as in medical settings. However, benchmarks involving multimodal medical data are scarce, especially routinely collected modalities such as Electronic Health Records (EHR) and Chest X-ray images (CXR). To contribute towards advancing multimodal learning in tackling real-world prediction tasks, we present MedMod, a multimodal medical benchmark with EHR and CXR using publicly available datasets MIMIC-IV and MIMIC-CXR, respectively. MedMod comprises five clinical prediction tasks: clinical conditions, in-hospital mortality, decompensation, length of stay, and radiological findings. We extensively evaluate several multimodal supervised learning models and self-supervised learning frameworks, making all of our code and models open-source.}
}
Endnote
%0 Conference Paper
%T MedMod: Multimodal Benchmark for Medical Prediction Tasks with Electronic Health Records and Chest X-Ray Scans
%A Shaza Elsharief
%A Saeed Shurrab
%A Baraa Al Jorf
%A Leopoldo Julian Lechuga Lopez
%A Krzysztof J. Geras
%A Farah E. Shamout
%B Proceedings of the sixth Conference on Health, Inference, and Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Xuhai Orson Xu
%E Edward Choi
%E Pankhuri Singhal
%E Walter Gerych
%E Shengpu Tang
%E Monica Agrawal
%E Adarsh Subbaswamy
%E Elena Sizikova
%E Jessilyn Dunn
%E Roxana Daneshjou
%E Tasmie Sarker
%E Matthew McDermott
%E Irene Chen
%F pmlr-v287-elsharief25a
%I PMLR
%P 781--803
%U https://proceedings.mlr.press/v287/elsharief25a.html
%V 287
%X Multimodal machine learning provides a myriad of opportunities for developing models that integrate multiple modalities and mimic decision-making in the real-world, such as in medical settings. However, benchmarks involving multimodal medical data are scarce, especially routinely collected modalities such as Electronic Health Records (EHR) and Chest X-ray images (CXR). To contribute towards advancing multimodal learning in tackling real-world prediction tasks, we present MedMod, a multimodal medical benchmark with EHR and CXR using publicly available datasets MIMIC-IV and MIMIC-CXR, respectively. MedMod comprises five clinical prediction tasks: clinical conditions, in-hospital mortality, decompensation, length of stay, and radiological findings. We extensively evaluate several multimodal supervised learning models and self-supervised learning frameworks, making all of our code and models open-source.
APA
Elsharief, S., Shurrab, S., Jorf, B.A., Lopez, L.J.L., Geras, K.J. & Shamout, F.E. (2025). MedMod: Multimodal Benchmark for Medical Prediction Tasks with Electronic Health Records and Chest X-Ray Scans. Proceedings of the sixth Conference on Health, Inference, and Learning, in Proceedings of Machine Learning Research 287:781-803. Available from https://proceedings.mlr.press/v287/elsharief25a.html.
