Causally motivated shortcut removal using auxiliary labels

Maggie Makar, Ben Packer, Dan Moldovan, Davis Blalock, Yoni Halpern, Alexander D’Amour
Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:739-766, 2022.

Abstract

Shortcut learning, in which models make use of easy-to-represent but unstable associations, is a major failure mode for robust machine learning. We study a flexible, causally-motivated approach to training robust predictors by discouraging the use of specific shortcuts, focusing on a common setting where a robust predictor could achieve optimal i.i.d. generalization in principle, but is overshadowed by a shortcut predictor in practice. Our approach uses auxiliary labels, typically available at training time, to enforce conditional independences implied by the causal graph. We show both theoretically and empirically that causally-motivated regularization schemes (a) lead to more robust estimators that generalize well under distribution shift, and (b) have better finite sample efficiency compared to usual regularization schemes, even when no shortcut is present. Our analysis highlights important theoretical properties of training techniques commonly used in the causal inference, fairness, and disentanglement literatures. Our code is available at github.com/mymakar/causally_motivated_shortcut_removal
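
The regularization scheme the abstract describes penalizes dependence between the learned representation and the auxiliary (shortcut) label, conditional on the main label. A minimal NumPy sketch of one such penalty is below: a maximum mean discrepancy (MMD) term comparing representations across auxiliary-label groups within each label stratum. This is an illustrative sketch only, not the authors' implementation; the function names, the RBF kernel, and the bandwidth choice are assumptions made here for concreteness.

```python
import numpy as np

def rbf_kernel(x, y, sigma=1.0):
    # Pairwise RBF kernel between rows of x and rows of y.
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def mmd2(x, y, sigma=1.0):
    # Unbiased estimate of the squared maximum mean discrepancy:
    # exclude the diagonal of the within-sample kernel matrices.
    kxx, kyy, kxy = (rbf_kernel(a, b, sigma) for a, b in [(x, x), (y, y), (x, y)])
    n, m = len(x), len(y)
    return ((kxx.sum() - np.trace(kxx)) / (n * (n - 1))
            - 2.0 * kxy.mean()
            + (kyy.sum() - np.trace(kyy)) / (m * (m - 1)))

def shortcut_penalty(reps, y, v, sigma=1.0):
    # Within each stratum of the main label y, penalize dependence of
    # the representation `reps` on the binary auxiliary label v.
    penalty = 0.0
    for label in np.unique(y):
        z, a = reps[y == label], v[y == label]
        if (a == 0).any() and (a == 1).any():
            penalty += mmd2(z[a == 0], z[a == 1], sigma)
    return penalty
```

In training, this penalty would be added (with a tuning weight) to the usual prediction loss, so that representations carrying information about the shortcut label are discouraged while label-relevant information is preserved.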

Cite this Paper


BibTeX
@InProceedings{pmlr-v151-makar22a,
  title     = {Causally motivated shortcut removal using auxiliary labels},
  author    = {Makar, Maggie and Packer, Ben and Moldovan, Dan and Blalock, Davis and Halpern, Yoni and D'Amour, Alexander},
  booktitle = {Proceedings of The 25th International Conference on Artificial Intelligence and Statistics},
  pages     = {739--766},
  year      = {2022},
  editor    = {Camps-Valls, Gustau and Ruiz, Francisco J. R. and Valera, Isabel},
  volume    = {151},
  series    = {Proceedings of Machine Learning Research},
  month     = {28--30 Mar},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v151/makar22a/makar22a.pdf},
  url       = {https://proceedings.mlr.press/v151/makar22a.html},
  abstract  = {Shortcut learning, in which models make use of easy-to-represent but unstable associations, is a major failure mode for robust machine learning. We study a flexible, causally-motivated approach to training robust predictors by discouraging the use of specific shortcuts, focusing on a common setting where a robust predictor could achieve optimal i.i.d generalization in principle, but is overshadowed by a shortcut predictor in practice. Our approach uses auxiliary labels, typically available at training time, to enforce conditional independences implied by the causal graph. We show both theoretically and empirically that causally-motivated regularization schemes (a) lead to more robust estimators that generalize well under distribution shift, and (b) have better finite sample efficiency compared to usual regularization schemes, even when no shortcut is present. Our analysis highlights important theoretical properties of training techniques commonly used in the causal inference, fairness, and disentanglement literatures. Our code is available at github.com/mymakar/causally_motivated_shortcut_removal}
}
Endnote
%0 Conference Paper
%T Causally motivated shortcut removal using auxiliary labels
%A Maggie Makar
%A Ben Packer
%A Dan Moldovan
%A Davis Blalock
%A Yoni Halpern
%A Alexander D’Amour
%B Proceedings of The 25th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2022
%E Gustau Camps-Valls
%E Francisco J. R. Ruiz
%E Isabel Valera
%F pmlr-v151-makar22a
%I PMLR
%P 739--766
%U https://proceedings.mlr.press/v151/makar22a.html
%V 151
%X Shortcut learning, in which models make use of easy-to-represent but unstable associations, is a major failure mode for robust machine learning. We study a flexible, causally-motivated approach to training robust predictors by discouraging the use of specific shortcuts, focusing on a common setting where a robust predictor could achieve optimal i.i.d generalization in principle, but is overshadowed by a shortcut predictor in practice. Our approach uses auxiliary labels, typically available at training time, to enforce conditional independences implied by the causal graph. We show both theoretically and empirically that causally-motivated regularization schemes (a) lead to more robust estimators that generalize well under distribution shift, and (b) have better finite sample efficiency compared to usual regularization schemes, even when no shortcut is present. Our analysis highlights important theoretical properties of training techniques commonly used in the causal inference, fairness, and disentanglement literatures. Our code is available at github.com/mymakar/causally_motivated_shortcut_removal
APA
Makar, M., Packer, B., Moldovan, D., Blalock, D., Halpern, Y. & D’Amour, A. (2022). Causally motivated shortcut removal using auxiliary labels. Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 151:739-766. Available from https://proceedings.mlr.press/v151/makar22a.html.