Federated Class-Heterogeneous Report Labeling with Surgical Aggregation

Nikhil Shah, Pranav Kulkarni, Florence Doo, Ang Li, Michael A. Jacobs, Vishwa Sanjay Parekh
Proceedings of The 8th International Conference on Medical Imaging with Deep Learning, PMLR 301:1418-1429, 2026.

Abstract

Labeling radiology reports is essential for creating medical imaging datasets and enabling AI-driven clinical decision support. While SBERT-based classifiers offer computationally efficient solutions for this task, a major challenge is class heterogeneity across datasets, as different groups focus on extracting distinct disease labels. For instance, the NIH and CheXpert CXR datasets share only 7 of their 14 and 13 labels, respectively. To address this, we propose using Surgical Aggregation, a class-heterogeneous federated learning framework that collaboratively trains a global multi-label classifier without requiring alignment of labeling schemes across clients. Surgical Aggregation selectively merges shared class weights while appending new disease-specific nodes, unifying distinct local labeling priorities and dynamically incorporating all disease labels of interest. We evaluated Surgical Aggregation in multiple simulated settings with varying numbers of participating nodes and different degrees of label overlap. Our results demonstrate high performance, confirming adaptability in class-heterogeneous environments and offering a scalable and privacy-preserving solution for collaborative medical report labeling. Our code is available at https://github.com/BioIntelligence-Lab/Federated-MedEmbedX
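
The merge-and-append idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `surgical_aggregate`, the dictionary layout (label name mapped to one row of a classification head), the unweighted mean, and the toy labels and weights are all assumptions made for illustration.

```python
from collections import defaultdict


def surgical_aggregate(client_heads):
    """Merge per-client classification heads keyed by label name.

    client_heads: list of dicts mapping label -> list of floats (one head row).
    Returns a dict covering every label seen on any client.
    Hypothetical helper: the unweighted mean is an assumption, not the paper's rule.
    """
    rows_by_label = defaultdict(list)
    for head in client_heads:
        for label, row in head.items():
            rows_by_label[label].append(row)

    global_head = {}
    for label, rows in rows_by_label.items():
        n = len(rows)
        # Shared labels: element-wise mean over the clients that trained them.
        # Labels held by a single client: n == 1, so the row is carried over as-is.
        global_head[label] = [sum(vals) / n for vals in zip(*rows)]
    return global_head


# Toy clients with one shared label and one client-specific label each.
client_a = {"Edema": [1.0, 3.0], "Hernia": [0.5, 0.5]}
client_b = {"Edema": [3.0, 5.0], "Fracture": [2.0, 4.0]}

merged = surgical_aggregate([client_a, client_b])
print(sorted(merged))   # ['Edema', 'Fracture', 'Hernia']
print(merged["Edema"])  # [2.0, 4.0]
```

Under the same reading, two clients holding 14 and 13 labels with 7 in common would produce a global head with 14 + 13 - 7 = 20 output nodes.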

Cite this Paper


BibTeX
@InProceedings{pmlr-v301-shah26a,
  title = {Federated Class-Heterogeneous Report Labeling with Surgical Aggregation},
  author = {Shah, Nikhil and Kulkarni, Pranav and Doo, Florence and Li, Ang and Jacobs, Michael A. and Parekh, Vishwa Sanjay},
  booktitle = {Proceedings of The 8th International Conference on Medical Imaging with Deep Learning},
  pages = {1418--1429},
  year = {2026},
  editor = {Tasdizen, Tolga and Elhabian, Shireen and Summers, Ronald and Chen, Chen and Koch, Lisa and Zhuang, Yan},
  volume = {301},
  series = {Proceedings of Machine Learning Research},
  month = {09--11 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v301/main/assets/shah26a/shah26a.pdf},
  url = {https://proceedings.mlr.press/v301/shah26a.html},
  abstract = {Labeling radiology reports is essential for creating medical imaging datasets and enabling AI-driven clinical decision support. While SBERT-based classifiers offer computationally efficient solutions for this task, a major challenge is the class heterogeneity across datasets, as different groups focus on extracting distinct disease labels. For instance, NIH and CheXpert CXR datasets share only 7 of their 14 and 13 labels, respectively. To address this, we propose to use Surgical Aggregation, a class-heterogeneous federated learning framework that collaboratively trains a global multi-label classifier without requiring alignment of labeling schemes across clients. Surgical Aggregation selectively merges shared class weights while appending new disease-specific nodes, thereby unifying distinct local labeling priorities, to dynamically incorporate all disease labels of interest. We evaluated Surgical Aggregation in multiple simulated settings with varying number of participating nodes as well as different degrees of overlapping labels. Our results demonstrate high performance confirming adaptability in class-heterogeneous environments, thereby offering a scalable and privacy-preserving solution for collaborative medical report labeling. Our code is available at https://github.com/BioIntelligence-Lab/Federated-MedEmbedX}
}
Endnote
%0 Conference Paper
%T Federated Class-Heterogeneous Report Labeling with Surgical Aggregation
%A Nikhil Shah
%A Pranav Kulkarni
%A Florence Doo
%A Ang Li
%A Michael A. Jacobs
%A Vishwa Sanjay Parekh
%B Proceedings of The 8th International Conference on Medical Imaging with Deep Learning
%C Proceedings of Machine Learning Research
%D 2026
%E Tolga Tasdizen
%E Shireen Elhabian
%E Ronald Summers
%E Chen Chen
%E Lisa Koch
%E Yan Zhuang
%F pmlr-v301-shah26a
%I PMLR
%P 1418--1429
%U https://proceedings.mlr.press/v301/shah26a.html
%V 301
%X Labeling radiology reports is essential for creating medical imaging datasets and enabling AI-driven clinical decision support. While SBERT-based classifiers offer computationally efficient solutions for this task, a major challenge is the class heterogeneity across datasets, as different groups focus on extracting distinct disease labels. For instance, NIH and CheXpert CXR datasets share only 7 of their 14 and 13 labels, respectively. To address this, we propose to use Surgical Aggregation, a class-heterogeneous federated learning framework that collaboratively trains a global multi-label classifier without requiring alignment of labeling schemes across clients. Surgical Aggregation selectively merges shared class weights while appending new disease-specific nodes, thereby unifying distinct local labeling priorities, to dynamically incorporate all disease labels of interest. We evaluated Surgical Aggregation in multiple simulated settings with varying number of participating nodes as well as different degrees of overlapping labels. Our results demonstrate high performance confirming adaptability in class-heterogeneous environments, thereby offering a scalable and privacy-preserving solution for collaborative medical report labeling. Our code is available at https://github.com/BioIntelligence-Lab/Federated-MedEmbedX
APA
Shah, N., Kulkarni, P., Doo, F., Li, A., Jacobs, M.A. & Parekh, V.S. (2026). Federated Class-Heterogeneous Report Labeling with Surgical Aggregation. Proceedings of The 8th International Conference on Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 301:1418-1429. Available from https://proceedings.mlr.press/v301/shah26a.html.
