NeuronTune: Towards Self-Guided Spurious Bias Mitigation

Guangtao Zheng, Wenqian Ye, Aidong Zhang
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:78454-78477, 2025.

Abstract

Deep neural networks often develop spurious bias, reliance on correlations between non-essential features and classes for predictions. For example, a model may identify objects based on frequently co-occurring backgrounds rather than intrinsic features, resulting in degraded performance on data lacking these correlations. Existing mitigation approaches typically depend on external annotations of spurious correlations, which may be difficult to obtain and are not relevant to the spurious bias in a model. In this paper, we take a step towards self-guided mitigation of spurious bias by proposing NeuronTune, a post hoc method that directly intervenes in a model’s internal decision process. Our method probes in a model’s latent embedding space to identify and regulate neurons that lead to spurious prediction behaviors. We theoretically justify our approach and show that it brings the model closer to an unbiased one. Unlike previous methods, NeuronTune operates without requiring spurious correlation annotations, making it a practical and effective tool for improving model robustness. Experiments across different architectures and data modalities demonstrate that our method significantly mitigates spurious bias in a self-guided way.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-zheng25s,
  title = {{N}euron{T}une: Towards Self-Guided Spurious Bias Mitigation},
  author = {Zheng, Guangtao and Ye, Wenqian and Zhang, Aidong},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {78454--78477},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/zheng25s/zheng25s.pdf},
  url = {https://proceedings.mlr.press/v267/zheng25s.html},
  abstract = {Deep neural networks often develop spurious bias, reliance on correlations between non-essential features and classes for predictions. For example, a model may identify objects based on frequently co-occurring backgrounds rather than intrinsic features, resulting in degraded performance on data lacking these correlations. Existing mitigation approaches typically depend on external annotations of spurious correlations, which may be difficult to obtain and are not relevant to the spurious bias in a model. In this paper, we take a step towards self-guided mitigation of spurious bias by proposing NeuronTune, a post hoc method that directly intervenes in a model’s internal decision process. Our method probes in a model’s latent embedding space to identify and regulate neurons that lead to spurious prediction behaviors. We theoretically justify our approach and show that it brings the model closer to an unbiased one. Unlike previous methods, NeuronTune operates without requiring spurious correlation annotations, making it a practical and effective tool for improving model robustness. Experiments across different architectures and data modalities demonstrate that our method significantly mitigates spurious bias in a self-guided way.}
}
Endnote
%0 Conference Paper
%T NeuronTune: Towards Self-Guided Spurious Bias Mitigation
%A Guangtao Zheng
%A Wenqian Ye
%A Aidong Zhang
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-zheng25s
%I PMLR
%P 78454--78477
%U https://proceedings.mlr.press/v267/zheng25s.html
%V 267
%X Deep neural networks often develop spurious bias, reliance on correlations between non-essential features and classes for predictions. For example, a model may identify objects based on frequently co-occurring backgrounds rather than intrinsic features, resulting in degraded performance on data lacking these correlations. Existing mitigation approaches typically depend on external annotations of spurious correlations, which may be difficult to obtain and are not relevant to the spurious bias in a model. In this paper, we take a step towards self-guided mitigation of spurious bias by proposing NeuronTune, a post hoc method that directly intervenes in a model’s internal decision process. Our method probes in a model’s latent embedding space to identify and regulate neurons that lead to spurious prediction behaviors. We theoretically justify our approach and show that it brings the model closer to an unbiased one. Unlike previous methods, NeuronTune operates without requiring spurious correlation annotations, making it a practical and effective tool for improving model robustness. Experiments across different architectures and data modalities demonstrate that our method significantly mitigates spurious bias in a self-guided way.
APA
Zheng, G., Ye, W. & Zhang, A. (2025). NeuronTune: Towards Self-Guided Spurious Bias Mitigation. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:78454-78477. Available from https://proceedings.mlr.press/v267/zheng25s.html.
