Does Invariant Risk Minimization Capture Invariance?

Pritish Kamath, Akilesh Tangella, Danica Sutherland, Nathan Srebro
Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:4069-4077, 2021.

Abstract

We show that the Invariant Risk Minimization (IRM) formulation of Arjovsky et al. (2019) can fail to capture "natural" invariances, at least when used in its practical "linear" form, and even on very simple problems which directly follow the motivating examples for IRM. This can lead to worse generalization on new environments, even when compared to unconstrained ERM. The issue stems from a significant gap between the linear variant (as in their concrete method IRMv1) and the full non-linear IRM formulation. Additionally, even when capturing the "right" invariances, we show that it is possible for IRM to learn a sub-optimal predictor, due to the loss function not being invariant across environments. The issues arise even when measuring invariance on the population distributions, but are exacerbated by the fact that IRM is extremely fragile to sampling.
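The "linear" IRMv1 variant the abstract refers to penalizes the gradient of each environment's risk with respect to a scalar multiplier on the predictor, evaluated at 1. A minimal sketch of that objective (hypothetical helper name, assuming squared loss and a linear predictor — not the authors' code):

```python
import numpy as np

def irmv1_objective(envs, beta, lam=1.0):
    """Sketch of an IRMv1-style objective: sum of per-environment risks
    plus lam times the squared gradient of each risk with respect to a
    scalar multiplier w on the predictor, evaluated at w = 1.

    envs: list of (X, y) arrays, one pair per environment.
    beta: weights of a linear predictor f(x) = <beta, x>.
    """
    total_risk, penalty = 0.0, 0.0
    for X, y in envs:
        preds = X @ beta
        resid = preds - y
        total_risk += np.mean(resid ** 2)          # squared-loss risk R_e
        # d/dw mean((w * preds - y)^2) at w = 1  ==  mean(2 * resid * preds)
        grad_w = np.mean(2.0 * resid * preds)
        penalty += grad_w ** 2
    return total_risk + lam * penalty
```

A predictor that fits every environment exactly drives both terms to zero; the paper's point is that minimizers of this penalized objective need not coincide with the "invariant" predictors the full non-linear IRM formulation targets.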

Cite this Paper


BibTeX
@InProceedings{pmlr-v130-kamath21a,
  title     = {Does Invariant Risk Minimization Capture Invariance?},
  author    = {Kamath, Pritish and Tangella, Akilesh and Sutherland, Danica and Srebro, Nathan},
  booktitle = {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages     = {4069--4077},
  year      = {2021},
  editor    = {Banerjee, Arindam and Fukumizu, Kenji},
  volume    = {130},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--15 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v130/kamath21a/kamath21a.pdf},
  url       = {https://proceedings.mlr.press/v130/kamath21a.html},
  abstract  = {We show that the Invariant Risk Minimization (IRM) formulation of Arjovsky et al. (2019) can fail to capture "natural" invariances, at least when used in its practical "linear" form, and even on very simple problems which directly follow the motivating examples for IRM. This can lead to worse generalization on new environments, even when compared to unconstrained ERM. The issue stems from a significant gap between the linear variant (as in their concrete method IRMv1) and the full non-linear IRM formulation. Additionally, even when capturing the "right" invariances, we show that it is possible for IRM to learn a sub-optimal predictor, due to the loss function not being invariant across environments. The issues arise even when measuring invariance on the population distributions, but are exacerbated by the fact that IRM is extremely fragile to sampling.}
}
Endnote
%0 Conference Paper
%T Does Invariant Risk Minimization Capture Invariance?
%A Pritish Kamath
%A Akilesh Tangella
%A Danica Sutherland
%A Nathan Srebro
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu
%F pmlr-v130-kamath21a
%I PMLR
%P 4069--4077
%U https://proceedings.mlr.press/v130/kamath21a.html
%V 130
%X We show that the Invariant Risk Minimization (IRM) formulation of Arjovsky et al. (2019) can fail to capture "natural" invariances, at least when used in its practical "linear" form, and even on very simple problems which directly follow the motivating examples for IRM. This can lead to worse generalization on new environments, even when compared to unconstrained ERM. The issue stems from a significant gap between the linear variant (as in their concrete method IRMv1) and the full non-linear IRM formulation. Additionally, even when capturing the "right" invariances, we show that it is possible for IRM to learn a sub-optimal predictor, due to the loss function not being invariant across environments. The issues arise even when measuring invariance on the population distributions, but are exacerbated by the fact that IRM is extremely fragile to sampling.
APA
Kamath, P., Tangella, A., Sutherland, D. & Srebro, N. (2021). Does Invariant Risk Minimization Capture Invariance?. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:4069-4077. Available from https://proceedings.mlr.press/v130/kamath21a.html.