Investigating General-Purpose Large Language Models for Patient Information Extraction: A Case Study on Real-World Cardiac MRI Reports

Sebin Sabu; Pavithra Rajendran; Ewart Jonny Sheldon; Alexandros Zenonos; Shiren Patel; Andrew Taylor; Rebecca Pope; Neil Sebire

Proceedings of The First AAAI Bridge Program on AI for Medicine and Healthcare, PMLR 281:63-69, 2025.

Abstract

Electronic Patient Record (EPR) systems in healthcare settings contain a significant volume of free text written by clinicians as unstructured data, which precludes timely access to potentially pertinent data signals. For a clinician to analyse information for a cohort of patients for research, information extracted from unstructured data needs to be mapped to the routinely collected standard structured information, which can require a lot of manual work and time. This paper studies the potential capabilities of general-purpose Large Language Models (LLMs) in terms of (1) practical deployment using limited CPU computing resources, (2) usefulness for extracting patient information within healthcare settings, and (3) use without fine-tuning or training models from scratch. In particular, we investigate the utility of prompt-based zero-shot predictions by adapting these models to a question answering framework, which is deployed and run within a secure on-premise environment with CPU servers to extract information from ten years of retrospective data containing 15,376 Cardiac MRI reports. Results are evaluated on a ground-truth dataset containing 400 randomly selected reports across the ten-year period, with the best performance reaching an averaged F1-score of 97.83%. Source code will be made available upon acceptance.

Cite this Paper

BibTeX

@InProceedings{pmlr-v281-sabu25a,
  title = 	 {Investigating General-Purpose Large Language Models for Patient Information Extraction: A Case Study on Real-World Cardiac MRI Reports},
  author =       {Sabu, Sebin and Rajendran, Pavithra and Sheldon, Ewart Jonny and Zenonos, Alexandros and Patel, Shiren and Taylor, Andrew and Pope, Rebecca and Sebire, Neil},
  booktitle = 	 {Proceedings of The First AAAI Bridge Program on AI for Medicine and Healthcare},
  pages = 	 {63--69},
  year = 	 {2025},
  editor = 	 {Wu, Junde and Zhu, Jiayuan and Xu, Min and Jin, Yueming},
  volume = 	 {281},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {25 Feb},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v281/main/assets/sabu25a/sabu25a.pdf},
  url = 	 {https://proceedings.mlr.press/v281/sabu25a.html},
  abstract = 	 {Electronic Patient Record (EPR) systems in healthcare settings contain a significant volume of free text written by clinicians as unstructured data, which precludes timely access to potentially pertinent data signals. For a clinician to analyse information for a cohort of patients for research, information extracted from unstructured data needs to be mapped to the routinely collected standard structured information, which can require a lot of manual work and time. This paper studies the potential capabilities of general-purpose Large Language Models (LLMs) in terms of (1) practical deployment using limited CPU computing resources, (2) usefulness for extracting patient information within healthcare settings, and (3) use without fine-tuning or training models from scratch. In particular, we investigate the utility of prompt-based zero-shot predictions by adapting these models to a question answering framework, which is deployed and run within a secure on-premise environment with CPU servers to extract information from ten years of retrospective data containing 15,376 Cardiac MRI reports. Results are evaluated on a ground-truth dataset containing 400 randomly selected reports across the ten-year period, with the best performance reaching an averaged F1-score of 97.83%. Source code will be made available upon acceptance.}
}

Endnote

%0 Conference Paper
%T Investigating General-Purpose Large Language Models for Patient Information Extraction: A Case Study on Real-World Cardiac MRI Reports
%A Sebin Sabu
%A Pavithra Rajendran
%A Ewart Jonny Sheldon
%A Alexandros Zenonos
%A Shiren Patel
%A Andrew Taylor
%A Rebecca Pope
%A Neil Sebire
%B Proceedings of The First AAAI Bridge Program on AI for Medicine and Healthcare
%C Proceedings of Machine Learning Research
%D 2025
%E Junde Wu
%E Jiayuan Zhu
%E Min Xu
%E Yueming Jin
%F pmlr-v281-sabu25a
%I PMLR
%P 63--69
%U https://proceedings.mlr.press/v281/sabu25a.html
%V 281
%X Electronic Patient Record (EPR) systems in healthcare settings contain a significant volume of free text written by clinicians as unstructured data, which precludes timely access to potentially pertinent data signals. For a clinician to analyse information for a cohort of patients for research, information extracted from unstructured data needs to be mapped to the routinely collected standard structured information, which can require a lot of manual work and time. This paper studies the potential capabilities of general-purpose Large Language Models (LLMs) in terms of (1) practical deployment using limited CPU computing resources, (2) usefulness for extracting patient information within healthcare settings, and (3) use without fine-tuning or training models from scratch. In particular, we investigate the utility of prompt-based zero-shot predictions by adapting these models to a question answering framework, which is deployed and run within a secure on-premise environment with CPU servers to extract information from ten years of retrospective data containing 15,376 Cardiac MRI reports. Results are evaluated on a ground-truth dataset containing 400 randomly selected reports across the ten-year period, with the best performance reaching an averaged F1-score of 97.83%. Source code will be made available upon acceptance.

APA

Sabu, S., Rajendran, P., Sheldon, E.J., Zenonos, A., Patel, S., Taylor, A., Pope, R. & Sebire, N. (2025). Investigating General-Purpose Large Language Models for Patient Information Extraction: A Case Study on Real-World Cardiac MRI Reports. Proceedings of The First AAAI Bridge Program on AI for Medicine and Healthcare, in Proceedings of Machine Learning Research 281:63-69. Available from https://proceedings.mlr.press/v281/sabu25a.html.
