Play the (Mis)Match: Using fMRI-Aligned Feature Fine-Tuning to Reveal Shortcut Bias in Deep Neural Networks

Yang Chen Lin, Chiayun Lee, Po-Chih Kuo
Proceedings of the First Workshop on NeuroAI Multimodal Intelligence @ AAAI 2026, PMLR 308:99-107, 2026.

Abstract

Deep neural networks (DNNs) often “cheat” by relying on shortcut objects (e.g., food$\Rightarrow$kitchen) rather than holistic spatial layout, undermining out-of-distribution (OOD) robustness. This work is a proof-of-concept exploration of whether fMRI alignment can reduce shortcut bias in visual DNNs. We address this issue with Play the (Mis)Match, a diagnostic dataset and brain-aligned fine-tuning framework. Leveraging fMRI recordings from the Natural Scenes Dataset (four participants; bedroom, bathroom, living room, kitchen), we curate MATCH images in which shortcut cues co-occur as usual and MISMATCH images from which those cues are removed. ImageNet-initialised CNN and Transformer backbones are fine-tuned with an MSE alignment loss that steers their intermediate features toward voxel patterns known to be less sensitive to shortcut cues. Our results show that, for ResNet, this procedure narrows the Match–Mismatch accuracy gap by 24% and redirects Grad-CAM attention from individual objects to holistic scene structure, particularly when aligning to activity from the scene-selective cortex (PPA, RSC, OPA), all without explicit shortcut annotations. Our study suggests that human-brain constraints may help steer DNNs toward more semantically grounded, less shortcut-dependent scene representations.
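
The two quantities the abstract relies on, an MSE alignment loss between model features and voxel patterns and the Match–Mismatch accuracy gap, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The linear `projection` from feature space to voxel space, the function names, and the toy array shapes are all assumptions made for illustration, and the random data below stands in for real features and fMRI responses.

```python
import numpy as np


def mse_alignment_loss(features, voxels, projection):
    """MSE between projected model features and fMRI voxel responses.

    features:   (n_images, n_features) intermediate-layer activations
    voxels:     (n_images, n_voxels) measured voxel responses
    projection: (n_features, n_voxels) linear map (an assumption of this
                sketch; the abstract does not say how the spaces are matched)
    """
    predicted = features @ projection
    return float(np.mean((predicted - voxels) ** 2))


def accuracy_gap(match_correct, mismatch_correct):
    """Accuracy on MATCH images minus accuracy on MISMATCH images.

    Both arguments are per-image booleans (True = correctly classified).
    """
    return float(np.mean(match_correct) - np.mean(mismatch_correct))


# Toy data only: random values standing in for features and voxel patterns.
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 16))
projection = rng.normal(size=(16, 32))
voxels = rng.normal(size=(8, 32))

print("alignment loss:", mse_alignment_loss(features, voxels, projection))
print("accuracy gap:", accuracy_gap([True, True, True, False],
                                    [True, False, False, False]))
```

A smaller gap means the model loses less accuracy when the shortcut cues are removed, which is the sense in which the abstract reports the gap as narrowing.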

Cite this Paper


BibTeX
@InProceedings{pmlr-v308-lin26a,
  title = {Play the (Mis)Match: Using fMRI-Aligned Feature Fine-Tuning to Reveal Shortcut Bias in Deep Neural Networks},
  author = {Lin, Yang Chen and Lee, Chiayun and Kuo, Po-Chih},
  booktitle = {Proceedings of the First Workshop on NeuroAI Multimodal Intelligence @ AAAI 2026},
  pages = {99--107},
  year = {2026},
  editor = {Abbasi-Asl, Reza and Iqbal, Asim and Ito, Shinya and Arkhipov, Anton and Sanborn, Sophia},
  volume = {308},
  series = {Proceedings of Machine Learning Research},
  month = {27 Jan},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v308/main/assets/lin26a/lin26a.pdf},
  url = {https://proceedings.mlr.press/v308/lin26a.html},
  abstract = {Deep neural networks (DNNs) often “cheat” by relying on shortcut objects (e.g., food$\Rightarrow$kitchen) rather than holistic spatial layout, undermining out-of-distribution (OOD) robustness. This work serves as a proof-of-concept exploration of whether fMRI alignment can reduce shortcut bias in visual DNNs. We address this issue with Play the (Mis)Match, a diagnostic dataset and brain-aligned fine-tuning framework. Leveraging fMRI recordings from the Natural Scenes Dataset (four participants; bedroom, bathroom, living room, kitchen), we curate MATCH images in which shortcut cues co-occur as usual and MISMATCH images from which those cues are removed. ImageNet-initialised CNN and Transformer backbones are fine-tuned with an MSE alignment loss that steers their intermediate features toward voxel patterns known to be less sensitive to shortcut cues. Our results show that, for ResNet, this procedure narrows the Match–Mismatch accuracy gap by 24 \% and redirects Grad-CAM attention from individual objects to holistic scene structure, particularly activity from the scene-selective cortex (PPA, RSC, OPA), all without explicit shortcut annotations. Our study provides a proof-of-concept that human-brain constraints may help steer DNNs toward more semantically grounded, less shortcut-dependent scene representations.}
}
Endnote
%0 Conference Paper
%T Play the (Mis)Match: Using fMRI-Aligned Feature Fine-Tuning to Reveal Shortcut Bias in Deep Neural Networks
%A Yang Chen Lin
%A Chiayun Lee
%A Po-Chih Kuo
%B Proceedings of the First Workshop on NeuroAI Multimodal Intelligence @ AAAI 2026
%C Proceedings of Machine Learning Research
%D 2026
%E Reza Abbasi-Asl
%E Asim Iqbal
%E Shinya Ito
%E Anton Arkhipov
%E Sophia Sanborn
%F pmlr-v308-lin26a
%I PMLR
%P 99--107
%U https://proceedings.mlr.press/v308/lin26a.html
%V 308
%X Deep neural networks (DNNs) often “cheat” by relying on shortcut objects (e.g., food$\Rightarrow$kitchen) rather than holistic spatial layout, undermining out-of-distribution (OOD) robustness. This work serves as a proof-of-concept exploration of whether fMRI alignment can reduce shortcut bias in visual DNNs. We address this issue with Play the (Mis)Match, a diagnostic dataset and brain-aligned fine-tuning framework. Leveraging fMRI recordings from the Natural Scenes Dataset (four participants; bedroom, bathroom, living room, kitchen), we curate MATCH images in which shortcut cues co-occur as usual and MISMATCH images from which those cues are removed. ImageNet-initialised CNN and Transformer backbones are fine-tuned with an MSE alignment loss that steers their intermediate features toward voxel patterns known to be less sensitive to shortcut cues. Our results show that, for ResNet, this procedure narrows the Match–Mismatch accuracy gap by 24 % and redirects Grad-CAM attention from individual objects to holistic scene structure, particularly activity from the scene-selective cortex (PPA, RSC, OPA), all without explicit shortcut annotations. Our study provides a proof-of-concept that human-brain constraints may help steer DNNs toward more semantically grounded, less shortcut-dependent scene representations.
APA
Lin, Y. C., Lee, C., & Kuo, P. (2026). Play the (Mis)Match: Using fMRI-Aligned Feature Fine-Tuning to Reveal Shortcut Bias in Deep Neural Networks. Proceedings of the First Workshop on NeuroAI Multimodal Intelligence @ AAAI 2026, in Proceedings of Machine Learning Research 308:99-107. Available from https://proceedings.mlr.press/v308/lin26a.html.
