There Are No Shortcuts to Anywhere Worth Going: Identifying Shortcuts in Deep Learning Models for Medical Image Analysis
Proceedings of the 7th International Conference on Medical Imaging with Deep Learning, PMLR 250:131-150, 2024.
Abstract
Many studies have reported human-level accuracy (or better) for AI-powered algorithms performing a specific clinical task, such as detecting pathology. However, these results often fail to generalize to other scanners or populations. Several mechanisms that confound generalization have been identified. One such mechanism is shortcut learning, in which a network erroneously learns to depend on a fragile, spurious feature, such as a text label added to the image, rather than scrutinizing the genuinely useful regions of the image. As a result, a system can exhibit misleadingly high test-set performance while the spurious feature is present, yet fail badly elsewhere, where the relationship between the label and the spurious feature breaks down. In this paper, we investigate whether it is possible to detect shortcut learning and to locate where in a neural network the shortcut is happening. We propose a novel methodology that uses the sample-difficulty metric Prediction Depth (PD) together with KL divergence to identify the specific layers of a neural network in which the learned features of a shortcut manifest. We demonstrate that our approach can effectively isolate these layers across several shortcuts, model architectures, and datasets. Using this, we show a correlation between the visual complexity of a shortcut, the depth at which its features manifest within the model, and the extent to which the model relies on it. Finally, we highlight the nuanced relationship between learning rate and shortcut learning.
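The abstract names two ingredients, per-layer Prediction Depth probes and a KL divergence between distributions, without specifying the implementation. The sketch below is a minimal, hedged illustration of how those pieces could fit together: k-NN probes attached to intermediate layers yield a per-sample Prediction Depth (following Baldock et al., 2021), and a KL divergence compares the resulting depth histograms for clean versus shortcut-bearing data. The helper names (collect_activations, prediction_depth, kl_divergence), the use of scikit-learn k-NN probes, and the choice of comparing depth histograms are assumptions for illustration, not the authors' actual method.

```python
# Hypothetical sketch: per-sample Prediction Depth via k-NN probes on intermediate
# layers, plus a KL divergence between depth histograms of clean vs. shortcut data.
# Helper names and the probing setup are assumptions, not the paper's implementation.
import numpy as np
import torch
from sklearn.neighbors import KNeighborsClassifier


@torch.no_grad()
def collect_activations(model, loader, layer_names, device="cpu"):
    """Run the model and collect flattened activations at each named layer."""
    feats = {name: [] for name in layer_names}
    labels, hooks, buffers = [], [], {}

    def make_hook(name):
        def hook(_module, _inp, out):
            buffers[name] = out.detach().flatten(start_dim=1).cpu()
        return hook

    modules = dict(model.named_modules())
    for name in layer_names:
        hooks.append(modules[name].register_forward_hook(make_hook(name)))

    model.eval().to(device)
    for x, y in loader:
        model(x.to(device))
        for name in layer_names:
            feats[name].append(buffers[name])
        labels.append(y)

    for h in hooks:
        h.remove()
    feats = {name: torch.cat(v).numpy() for name, v in feats.items()}
    return feats, torch.cat(labels).numpy()


def prediction_depth(train_feats, train_labels, test_feats, k=30):
    """PD of a sample = earliest layer from which the k-NN probe prediction
    agrees with the final-layer probe at every deeper layer."""
    layer_names = list(train_feats.keys())
    probe_preds = []
    for name in layer_names:
        knn = KNeighborsClassifier(n_neighbors=k).fit(train_feats[name], train_labels)
        probe_preds.append(knn.predict(test_feats[name]))
    probe_preds = np.stack(probe_preds)          # (num_layers, num_samples)
    agrees = probe_preds == probe_preds[-1]      # agreement with final probe
    # Depth = first layer index from which agreement holds for all deeper layers.
    suffix_agree = np.flip(np.cumprod(np.flip(agrees, 0), 0), 0).astype(bool)
    return suffix_agree.argmax(axis=0)           # per-sample prediction depth


def kl_divergence(p_counts, q_counts, eps=1e-8):
    """KL(P || Q) between two histograms, e.g. of prediction depths."""
    p = p_counts / p_counts.sum() + eps
    q = q_counts / q_counts.sum() + eps
    return float(np.sum(p * np.log(p / q)))
```

Under these assumptions, one would compute prediction-depth histograms for a clean and a shortcut-augmented version of the same test set and inspect where the KL divergence between them (or between per-layer probe behaviour) is largest, taking that as an indication of the layers in which the shortcut's features manifest.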