Learn to Expect the Unexpected: Probably Approximately Correct Domain Generalization

Vikas Garg, Adam Tauman Kalai, Katrina Ligett, Steven Wu
Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:3574-3582, 2021.

Abstract

Domain generalization is the problem of machine learning when the training data and the test data come from different “domains” (data distributions). We propose an elementary theoretical model of the domain generalization problem, introducing the concept of a meta-distribution over domains. In our model, the training data available to a learning algorithm consist of multiple datasets, each from a single domain, drawn in turn from the meta-distribution. We show that our model can capture a rich range of learning phenomena specific to domain generalization for three different settings: learning with Massart noise, learning decision trees, and feature selection. We demonstrate approaches that leverage domain generalization to reduce computational or data requirements in each of these settings. Experiments demonstrate that our feature selection algorithm indeed ignores spurious correlations and improves generalization.
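As a rough illustration of the two-stage sampling model described in the abstract, here is a minimal Python sketch. The Gaussian domains and the names sample_domain / sample_dataset are hypothetical choices for illustration, not constructions from the paper:

import numpy as np

rng = np.random.default_rng(0)

def sample_domain():
    # A "domain" is a data distribution. Here each domain is parameterized
    # by a mean vector for its features, drawn from the meta-distribution.
    return rng.normal(loc=0.0, scale=1.0, size=3)

def sample_dataset(domain_mean, n=50):
    # Draw an i.i.d. dataset from the given domain. The labeling rule
    # (sign of feature 0) is shared across domains; the remaining features
    # shift with the domain and can carry spurious correlations.
    X = rng.normal(loc=domain_mean, scale=1.0, size=(n, 3))
    y = (X[:, 0] > 0).astype(int)
    return X, y

# Training data: multiple datasets, each from a single domain that is
# itself drawn from the meta-distribution.
train = [sample_dataset(sample_domain()) for _ in range(5)]

# At test time, a fresh domain is drawn from the same meta-distribution,
# so the learner must generalize to distributions it has never seen.
X_test, y_test = sample_dataset(sample_domain())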

Cite this Paper


BibTeX
@InProceedings{pmlr-v130-garg21a,
  title     = {Learn to Expect the Unexpected: Probably Approximately Correct Domain Generalization},
  author    = {Garg, Vikas and Tauman Kalai, Adam and Ligett, Katrina and Wu, Steven},
  booktitle = {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages     = {3574--3582},
  year      = {2021},
  editor    = {Banerjee, Arindam and Fukumizu, Kenji},
  volume    = {130},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--15 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v130/garg21a/garg21a.pdf},
  url       = {https://proceedings.mlr.press/v130/garg21a.html}
}
Endnote
%0 Conference Paper
%T Learn to Expect the Unexpected: Probably Approximately Correct Domain Generalization
%A Vikas Garg
%A Adam Tauman Kalai
%A Katrina Ligett
%A Steven Wu
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu
%F pmlr-v130-garg21a
%I PMLR
%P 3574--3582
%U https://proceedings.mlr.press/v130/garg21a.html
%V 130
APA
Garg, V., Tauman Kalai, A., Ligett, K. & Wu, S. (2021). Learn to Expect the Unexpected: Probably Approximately Correct Domain Generalization. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:3574-3582. Available from https://proceedings.mlr.press/v130/garg21a.html.