KidneyGrader: Fine-Grained Tubulitis Scoring Using Weakly Supervised Transformers
Proceedings of the MICCAI Workshop on Computational Pathology, PMLR 316:106-115, 2026.
Abstract
Accurate tubulitis scoring is essential for managing kidney transplant rejection, yet manual assessment is subjective and suffers from severe inter-rater variability ($\kappa_w = 0.17$), leading to inconsistent treatment decisions. While recent works have attempted binary tubulitis detection, fine-grained scoring (T0-T3) required for clinical decision-making remains unaddressed. We present the first automated approach for granular tubulitis scoring using only slide-level supervision. Our approach aggregates spatially correlated features from tubule-centric image patches using a transformer-based attention pooling mechanism. To ensure diagnostic focus, patches are pre-filtered using a segmentation model trained to detect renal tubules, restricting the input space to regions most relevant for scoring. Evaluated on 93 routine PAS-stained slides (75 for training/validation, 18 held-out test), our method achieves a weighted kappa of $\kappa_w = 0.75$ (4.4$\times$ improvement over expert agreement), 83.3% within-one-grade accuracy, and strong correlation with expert scores ($r = 0.81$). Top-attended regions demonstrate clinical plausibility, showing progressively greater inflammatory burden and tissue damage features with increasing T-scores. Our work demonstrates that weakly supervised learning can transform subjective pathology assessments into reliable, interpretable predictions, offering a practical path towards standardising transplant rejection diagnosis. The code is available on GitHub.
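For readers unfamiliar with the agreement metrics quoted above, the following is a minimal sketch of how a weighted kappa and a within-one-grade accuracy could be computed for ordinal T0-T3 scores. It is not taken from the released code. The function names, the encoding of grades as integers 0 to 3, and the `power` parameter are illustrative assumptions, and the abstract does not state whether linear or quadratic weights were used, so both are supported here.

```python
def weighted_kappa(y_true, y_pred, n_classes=4, power=2):
    """Cohen's weighted kappa for ordinal labels encoded as 0..n_classes-1.

    power=1 gives linear weights, power=2 quadratic weights.
    Which weighting the paper uses is an assumption left to the caller.
    """
    n = len(y_true)
    # Observed joint distribution of (reference grade, predicted grade).
    observed = [[0.0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1.0 / n
    # Marginal distributions, used for the chance-expected agreement.
    row = [sum(observed[i]) for i in range(n_classes)]
    col = [sum(observed[i][j] for i in range(n_classes)) for j in range(n_classes)]

    num = 0.0  # weighted observed disagreement
    den = 0.0  # weighted disagreement expected by chance
    for i in range(n_classes):
        for j in range(n_classes):
            w = (abs(i - j) / (n_classes - 1)) ** power
            num += w * observed[i][j]
            den += w * row[i] * col[j]
    if den == 0.0:
        # Kappa is undefined when both raters use a single grade throughout.
        return float("nan")
    return 1.0 - num / den


def within_one_accuracy(y_true, y_pred):
    """Fraction of cases whose predicted grade is at most one grade off."""
    hits = sum(abs(t - p) <= 1 for t, p in zip(y_true, y_pred))
    return hits / len(y_true)
```

Under this definition, a within-one-grade accuracy of 83.3% on an 18-slide test set would correspond to 15 of 18 slides falling within one grade of the reference score.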