Weight Space Correlation Analysis: Quantifying Feature Utilization in Deep Learning Models
Proceedings of The 9th International Conference on Medical Imaging with Deep Learning, PMLR 315:2711-2737, 2026.
Abstract
Deep learning models in medical imaging are susceptible to shortcut learning: they can rely on confounding metadata (e.g. scanner model) that is often encoded in image embeddings. The crucial question is whether a model actively utilizes this encoded information for its final prediction. We introduce Weight Space Correlation analysis, an interpretable methodology that quantifies feature utilization by measuring the alignment between the classification heads of a primary clinical task and those of auxiliary metadata tasks. We first validate our method by successfully detecting artificially induced shortcut learning. We then apply it to probe the feature utilization of an SA-SonoNet model trained for Spontaneous Preterm Birth (sPTB) prediction. Our analysis confirms that, although the embeddings encode substantial metadata information, the sPTB classifier’s weight vectors are highly correlated with clinically relevant factors (e.g. cervical length) but decoupled from clinically irrelevant acquisition factors (e.g. scanner). Our methodology provides a tool for verifying model trustworthiness by inspecting whether a model utilizes features unrelated to the genuine clinical signal.
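As a rough illustration of the idea of measuring alignment between classification heads, the sketch below computes a Pearson correlation between two weight vectors. It assumes that both heads are linear layers reading the same embedding, so their weight vectors have equal length. The function name `weight_space_correlation` and the choice of Pearson correlation are assumptions made for illustration only; the abstract does not specify the exact alignment measure used.

```python
import numpy as np


def weight_space_correlation(w_primary, w_aux):
    """Pearson correlation between two head weight vectors.

    Hypothetical helper: assumes both vectors come from linear heads
    attached to the same embedding, so they share one dimensionality.
    """
    w_primary = np.asarray(w_primary, dtype=float).ravel()
    w_aux = np.asarray(w_aux, dtype=float).ravel()
    if w_primary.shape != w_aux.shape:
        raise ValueError("weight vectors must have the same length")

    # Centre each vector, then take the cosine of the centred vectors.
    a = w_primary - w_primary.mean()
    b = w_aux - w_aux.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        # A constant weight vector carries no direction to compare.
        return 0.0
    return float(a @ b / denom)
```

Under these assumptions, a value near 1 or -1 between the primary head and a metadata head would suggest that the primary classifier reads the embedding along a similar direction to the metadata classifier, while a value near 0 would suggest the two heads are decoupled.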