Estimating Generalization under Distribution Shifts via Domain-Invariant Representations

Ching-Yao Chuang, Antonio Torralba, Stefanie Jegelka
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:1984-1994, 2020.

Abstract

When machine learning models are deployed on a test distribution different from the training distribution, they can perform poorly yet overestimate their performance. In this work, we aim to better estimate a model’s performance under distribution shift, without supervision. To do so, we use a set of domain-invariant predictors as a proxy for the unknown, true target labels. Since the error of the resulting risk estimate depends on the target risk of the proxy model, we study generalization of domain-invariant representations and show that the complexity of the latent representation has a significant influence on the target risk. Empirically, our approach (1) enables self-tuning of domain adaptation models, and (2) accurately estimates the target error of given models under distribution shift. Other applications include model selection, early stopping decisions, and error detection.
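The following is a minimal NumPy sketch of the proxy-risk idea the abstract describes, not the paper's released code: the target risk of a model is estimated by its disagreement with domain-invariant proxy predictors on unlabeled target data, since the true target labels are unavailable. The names (estimate_target_risk, the toy model and proxies) are hypothetical stand-ins, and taking the maximum disagreement over the proxy set is one conservative way to combine it; the paper's precise combination rule and theoretical guarantees are in the full text.

import numpy as np

def estimate_target_risk(model_predict, proxy_predicts, target_inputs):
    # Predicted class indices of the model under evaluation on unlabeled target data.
    model_preds = model_predict(target_inputs)
    # Disagreement rate with each domain-invariant proxy; the proxies'
    # predictions stand in for the unknown target labels.
    disagreements = [np.mean(model_preds != proxy(target_inputs))
                     for proxy in proxy_predicts]
    # Worst case over the proxy set gives a conservative 0-1 risk estimate.
    return float(max(disagreements))

# Toy usage with stand-in predictors on synthetic "target" data.
rng = np.random.default_rng(0)
X_target = rng.normal(size=(1000, 16))
model = lambda X: (X[:, 0] > 0.0).astype(int)
proxies = [lambda X, t=t: (X[:, 0] > t).astype(int) for t in (0.1, -0.1)]
print(estimate_target_risk(model, proxies, X_target))

As the abstract notes, the quality of such an estimate hinges on the proxies themselves having low target risk, which is what motivates the paper's analysis of how the complexity of the invariant representation affects that risk.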

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-chuang20a,
  title = {Estimating Generalization under Distribution Shifts via Domain-Invariant Representations},
  author = {Chuang, Ching-Yao and Torralba, Antonio and Jegelka, Stefanie},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages = {1984--1994},
  year = {2020},
  editor = {III, Hal Daumé and Singh, Aarti},
  volume = {119},
  series = {Proceedings of Machine Learning Research},
  month = {13--18 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v119/chuang20a/chuang20a.pdf},
  url = {https://proceedings.mlr.press/v119/chuang20a.html},
  abstract = {When machine learning models are deployed on a test distribution different from the training distribution, they can perform poorly yet overestimate their performance. In this work, we aim to better estimate a model’s performance under distribution shift, without supervision. To do so, we use a set of domain-invariant predictors as a proxy for the unknown, true target labels. Since the error of the resulting risk estimate depends on the target risk of the proxy model, we study generalization of domain-invariant representations and show that the complexity of the latent representation has a significant influence on the target risk. Empirically, our approach (1) enables self-tuning of domain adaptation models, and (2) accurately estimates the target error of given models under distribution shift. Other applications include model selection, early stopping decisions, and error detection.}
}
Endnote
%0 Conference Paper
%T Estimating Generalization under Distribution Shifts via Domain-Invariant Representations
%A Ching-Yao Chuang
%A Antonio Torralba
%A Stefanie Jegelka
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-chuang20a
%I PMLR
%P 1984--1994
%U https://proceedings.mlr.press/v119/chuang20a.html
%V 119
%X When machine learning models are deployed on a test distribution different from the training distribution, they can perform poorly yet overestimate their performance. In this work, we aim to better estimate a model’s performance under distribution shift, without supervision. To do so, we use a set of domain-invariant predictors as a proxy for the unknown, true target labels. Since the error of the resulting risk estimate depends on the target risk of the proxy model, we study generalization of domain-invariant representations and show that the complexity of the latent representation has a significant influence on the target risk. Empirically, our approach (1) enables self-tuning of domain adaptation models, and (2) accurately estimates the target error of given models under distribution shift. Other applications include model selection, early stopping decisions, and error detection.
APA
Chuang, C., Torralba, A. & Jegelka, S. (2020). Estimating Generalization under Distribution Shifts via Domain-Invariant Representations. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:1984-1994. Available from https://proceedings.mlr.press/v119/chuang20a.html.
