The Impact of Image Resolution on Biomedical Multimodal Large Language Models

Liangyu Chen, James Burgess, Jeffrey J Nirschl, Orr Zohar, Serena Yeung-Levy
Proceedings of the 10th Machine Learning for Healthcare Conference, PMLR 298, 2025.

Abstract

Imaging technologies are fundamental to biomedical research and modern medicine, requiring analysis of high-resolution images across various modalities. While multimodal large language models (MLLMs) show promise for biomedical image analysis, most are designed for low-resolution images from general-purpose datasets, risking critical information loss. We investigate how image resolution affects MLLM performance in biomedical applications and demonstrate that: (1) native-resolution training and inference significantly improve performance across multiple tasks, (2) misalignment between training and inference resolutions severely degrades performance, and (3) mixed-resolution training effectively mitigates misalignment and balances computational constraints with performance requirements. Based on these findings, we recommend prioritizing native-resolution inference and mixed-resolution datasets to optimize biomedical MLLMs for transformative impact in scientific research and clinical applications.

Cite this Paper


BibTeX
@InProceedings{pmlr-v298-chen25a, title = {The Impact of Image Resolution on Biomedical Multimodal Large Language Models}, author = {Chen, Liangyu and Burgess, James and Nirschl, Jeffrey J and Zohar, Orr and Yeung-Levy, Serena}, booktitle = {Proceedings of the 10th Machine Learning for Healthcare Conference}, year = {2025}, editor = {Agrawal, Monica and Deshpande, Kaivalya and Engelhard, Matthew and Joshi, Shalmali and Tang, Shengpu and Urteaga, Iñigo}, volume = {298}, series = {Proceedings of Machine Learning Research}, month = {15--16 Aug}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v298/main/assets/chen25a/chen25a.pdf}, url = {https://proceedings.mlr.press/v298/chen25a.html}, abstract = {Imaging technologies are fundamental to biomedical research and modern medicine, requiring analysis of high-resolution images across various modalities. While multimodal large language models (MLLMs) show promise for biomedical image analysis, most are designed for low-resolution images from general-purpose datasets, risking critical information loss. We investigate how image resolution affects MLLM performance in biomedical applications and demonstrate that: (1) native-resolution training and inference significantly improve performance across multiple tasks, (2) misalignment between training and inference resolutions severely degrades performance, and (3) mixed-resolution training effectively mitigates misalignment and balances computational constraints with performance requirements. Based on these findings, we recommend prioritizing native-resolution inference and mixed-resolution datasets to optimize biomedical MLLMs for transformative impact in scientific research and clinical applications.} }
Endnote
%0 Conference Paper %T The Impact of Image Resolution on Biomedical Multimodal Large Language Models %A Liangyu Chen %A James Burgess %A Jeffrey J Nirschl %A Orr Zohar %A Serena Yeung-Levy %B Proceedings of the 10th Machine Learning for Healthcare Conference %C Proceedings of Machine Learning Research %D 2025 %E Monica Agrawal %E Kaivalya Deshpande %E Matthew Engelhard %E Shalmali Joshi %E Shengpu Tang %E Iñigo Urteaga %F pmlr-v298-chen25a %I PMLR %U https://proceedings.mlr.press/v298/chen25a.html %V 298 %X Imaging technologies are fundamental to biomedical research and modern medicine, requiring analysis of high-resolution images across various modalities. While multimodal large language models (MLLMs) show promise for biomedical image analysis, most are designed for low-resolution images from general-purpose datasets, risking critical information loss. We investigate how image resolution affects MLLM performance in biomedical applications and demonstrate that: (1) native-resolution training and inference significantly improve performance across multiple tasks, (2) misalignment between training and inference resolutions severely degrades performance, and (3) mixed-resolution training effectively mitigates misalignment and balances computational constraints with performance requirements. Based on these findings, we recommend prioritizing native-resolution inference and mixed-resolution datasets to optimize biomedical MLLMs for transformative impact in scientific research and clinical applications.
APA
Chen, L., Burgess, J., Nirschl, J.J., Zohar, O. & Yeung-Levy, S.. (2025). The Impact of Image Resolution on Biomedical Multimodal Large Language Models. Proceedings of the 10th Machine Learning for Healthcare Conference, in Proceedings of Machine Learning Research 298 Available from https://proceedings.mlr.press/v298/chen25a.html.

Related Material