Unsupervised Concept Discovery Mitigates Spurious Correlations

Md Rifat Arefin, Yan Zhang, Aristide Baratin, Francesco Locatello, Irina Rish, Dianbo Liu, Kenji Kawaguchi
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:1672-1688, 2024.

Abstract

Models prone to spurious correlations in training data often produce brittle predictions and introduce unintended biases. Addressing this challenge typically involves methods that rely on prior knowledge and group annotations to remove spurious correlations, but such information may not be readily available in many applications. In this paper, we establish a novel connection between unsupervised object-centric learning and mitigation of spurious correlations. Instead of directly inferring subgroups with varying correlations with labels, our approach focuses on discovering concepts: discrete ideas that are shared across input samples. Leveraging existing object-centric representation learning, we introduce CoBalT: a concept balancing technique that effectively mitigates spurious correlations without requiring human labeling of subgroups. Evaluation across the benchmark datasets for sub-population shifts demonstrates superior or competitive performance compared to state-of-the-art baselines, without the need for group annotation. Code is available at https://github.com/rarefin/CoBalT

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-arefin24a,
  title = {Unsupervised Concept Discovery Mitigates Spurious Correlations},
  author = {Arefin, Md Rifat and Zhang, Yan and Baratin, Aristide and Locatello, Francesco and Rish, Irina and Liu, Dianbo and Kawaguchi, Kenji},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages = {1672--1688},
  year = {2024},
  editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = {235},
  series = {Proceedings of Machine Learning Research},
  month = {21--27 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/arefin24a/arefin24a.pdf},
  url = {https://proceedings.mlr.press/v235/arefin24a.html},
  abstract = {Models prone to spurious correlations in training data often produce brittle predictions and introduce unintended biases. Addressing this challenge typically involves methods relying on prior knowledge and group annotation to remove spurious correlations, which may not be readily available in many applications. In this paper, we establish a novel connection between unsupervised object-centric learning and mitigation of spurious correlations. Instead of directly inferring subgroups with varying correlations with labels, our approach focuses on discovering concepts: discrete ideas that are shared across input samples. Leveraging existing object-centric representation learning, we introduce CoBalT: a concept balancing technique that effectively mitigates spurious correlations without requiring human labeling of subgroups. Evaluation across the benchmark datasets for sub-population shifts demonstrate superior or competitive performance compared state-of-the-art baselines, without the need for group annotation. Code is available at https://github.com/rarefin/CoBalT}
}
Endnote
%0 Conference Paper
%T Unsupervised Concept Discovery Mitigates Spurious Correlations
%A Md Rifat Arefin
%A Yan Zhang
%A Aristide Baratin
%A Francesco Locatello
%A Irina Rish
%A Dianbo Liu
%A Kenji Kawaguchi
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-arefin24a
%I PMLR
%P 1672--1688
%U https://proceedings.mlr.press/v235/arefin24a.html
%V 235
%X Models prone to spurious correlations in training data often produce brittle predictions and introduce unintended biases. Addressing this challenge typically involves methods relying on prior knowledge and group annotation to remove spurious correlations, which may not be readily available in many applications. In this paper, we establish a novel connection between unsupervised object-centric learning and mitigation of spurious correlations. Instead of directly inferring subgroups with varying correlations with labels, our approach focuses on discovering concepts: discrete ideas that are shared across input samples. Leveraging existing object-centric representation learning, we introduce CoBalT: a concept balancing technique that effectively mitigates spurious correlations without requiring human labeling of subgroups. Evaluation across the benchmark datasets for sub-population shifts demonstrate superior or competitive performance compared state-of-the-art baselines, without the need for group annotation. Code is available at https://github.com/rarefin/CoBalT
APA
Arefin, M.R., Zhang, Y., Baratin, A., Locatello, F., Rish, I., Liu, D. & Kawaguchi, K. (2024). Unsupervised Concept Discovery Mitigates Spurious Correlations. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:1672-1688. Available from https://proceedings.mlr.press/v235/arefin24a.html.
