ALO: Addressing Class Imbalance in Radiology Report Generation through Anatomy-Level Oversampling

Lukas Buess, Robert Kurin, Adarsh Bhandary Panambur, Tomas Arias-Vergara, Andreas Maier
Proceedings of The 9th International Conference on Medical Imaging with Deep Learning, PMLR 315:2998-3017, 2026.

Abstract

Radiology report generation aims to connect visual understanding with clinical language, yet most methods rely on free-text supervision, which is highly variable and difficult to evaluate. Clinical datasets are also dominated by normal findings, causing models to underreport abnormalities. While recent work focuses on architectural advances, we show that structured supervision and balanced sampling can yield substantial gains in clinical performance. We convert free-text reports into structured anatomy-level representations and introduce Anatomy-Level Oversampling (ALO), a data-centric sampling strategy that balances normal and abnormal findings for each anatomical region. This structure provides consistent supervision and enables more informative evaluation. Across three public datasets, ALO improves sensitivity to pathological findings while remaining fully model-agnostic. On internal validation, ALO increases F1-Score by 50% and CRG by 5.8%, and on external validation, it increases F1-Score by 45.1% and CRG by 5%. These results highlight the importance of structured data and balanced sampling for reliable report generation.
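
The abstract does not spell out how the per-region balancing is carried out, so the following is only a minimal sketch of one way such sampling could work, not the authors' implementation. It assumes each structured report is a non-empty dictionary mapping an anatomical region to a boolean abnormality flag, and it weights each report by the inverse frequency of the rarest (region, label) pair it contains. The function names, the data layout, and the weighting rule are all hypothetical.

```python
import random
from collections import Counter


def region_label_counts(reports):
    """Count how often each (region, is_abnormal) pair occurs in the dataset."""
    counts = Counter()
    for report in reports:
        for region, is_abnormal in report.items():
            counts[(region, is_abnormal)] += 1
    return counts


def sampling_weights(reports):
    """Assumed rule: weight a report by its rarest (region, label) pair."""
    counts = region_label_counts(reports)
    return [
        max(1.0 / counts[(region, is_abnormal)]
            for region, is_abnormal in report.items())
        for report in reports
    ]


def oversample(reports, k, seed=0):
    """Draw k reports with replacement according to the weights above."""
    rng = random.Random(seed)
    return rng.choices(reports, weights=sampling_weights(reports), k=k)


# Toy dataset (hypothetical): three fully normal reports, one abnormal lung finding.
reports = [
    {"lungs": False, "heart": False},
    {"lungs": False, "heart": False},
    {"lungs": False, "heart": False},
    {"lungs": True, "heart": False},
]
weights = sampling_weights(reports)
batch = oversample(reports, k=8)
```

In this toy case the single abnormal report receives the same total weight as the three normal reports combined, so draws are balanced between normal and abnormal lung findings in expectation. A real dataset has many regions per report, where balancing one region shifts the others, so a rule this simple would only approximate per-region balance.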

Cite this Paper


BibTeX
@InProceedings{pmlr-v315-buess26a,
  title = {ALO: Addressing Class Imbalance in Radiology Report Generation through Anatomy-Level Oversampling},
  author = {Buess, Lukas and Kurin, Robert and Bhandary Panambur, Adarsh and Arias-Vergara, Tomas and Maier, Andreas},
  booktitle = {Proceedings of The 9th International Conference on Medical Imaging with Deep Learning},
  pages = {2998--3017},
  year = {2026},
  editor = {Huo, Yuankai and Gao, Mingchen and Kuo, Chang-Fu and Jin, Yueming and Deng, Ruining},
  volume = {315},
  series = {Proceedings of Machine Learning Research},
  month = {08--10 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v315/main/assets/buess26a/buess26a.pdf},
  url = {https://proceedings.mlr.press/v315/buess26a.html},
  abstract = {Radiology report generation aims to connect visual understanding with clinical language, yet most methods rely on free-text supervision, which is highly variable and difficult to evaluate. Clinical datasets are also dominated by normal findings, causing models to underreport abnormalities. While recent works focus on architectural advances, we show that structured supervision and balanced sampling can yield substantial gains in clinical performance. We convert free-text reports into structured anatomy-level representations and introduce Anatomy-Level Oversampling (ALO), a data centered sampling strategy that balances normal and abnormal findings for each anatomical region. This structure provides consistent supervision and enables more informative evaluation. Across three public datasets, ALO improves sensitivity to pathological findings while remaining fully model agnostic. On internal validation, ALO increases F1-Score by 50% and CRG by 5.8%, and on external validation, it increases F1-Score by 45.1% and CRG by 5%. These results highlight the importance of structured data and balanced sampling for reliable report generation.}
}
Endnote
%0 Conference Paper
%T ALO: Addressing Class Imbalance in Radiology Report Generation through Anatomy-Level Oversampling
%A Lukas Buess
%A Robert Kurin
%A Adarsh Bhandary Panambur
%A Tomas Arias-Vergara
%A Andreas Maier
%B Proceedings of The 9th International Conference on Medical Imaging with Deep Learning
%C Proceedings of Machine Learning Research
%D 2026
%E Yuankai Huo
%E Mingchen Gao
%E Chang-Fu Kuo
%E Yueming Jin
%E Ruining Deng
%F pmlr-v315-buess26a
%I PMLR
%P 2998--3017
%U https://proceedings.mlr.press/v315/buess26a.html
%V 315
%X Radiology report generation aims to connect visual understanding with clinical language, yet most methods rely on free-text supervision, which is highly variable and difficult to evaluate. Clinical datasets are also dominated by normal findings, causing models to underreport abnormalities. While recent works focus on architectural advances, we show that structured supervision and balanced sampling can yield substantial gains in clinical performance. We convert free-text reports into structured anatomy-level representations and introduce Anatomy-Level Oversampling (ALO), a data centered sampling strategy that balances normal and abnormal findings for each anatomical region. This structure provides consistent supervision and enables more informative evaluation. Across three public datasets, ALO improves sensitivity to pathological findings while remaining fully model agnostic. On internal validation, ALO increases F1-Score by 50% and CRG by 5.8%, and on external validation, it increases F1-Score by 45.1% and CRG by 5%. These results highlight the importance of structured data and balanced sampling for reliable report generation.
APA
Buess, L., Kurin, R., Bhandary Panambur, A., Arias-Vergara, T. & Maier, A. (2026). ALO: Addressing Class Imbalance in Radiology Report Generation through Anatomy-Level Oversampling. Proceedings of The 9th International Conference on Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 315:2998-3017. Available from https://proceedings.mlr.press/v315/buess26a.html.
