In Search of Forgotten Domain Generalization

Prasanna Mayilvahanan; Roland S. Zimmermann; Thaddäus Wiedemer; Evgenia Rusak; Attila Juhos; Matthias Bethge; Wieland Brendel

Proceedings on "I Can't Believe It's Not Better: Challenges in Applied Deep Learning" at ICLR 2025 Workshops, PMLR 296:90-130, 2025.

Abstract

Out-of-Domain (OOD) generalization is the ability of a model trained on one or more domains to generalize to unseen domains. In the ImageNet era of computer vision, evaluation sets for measuring a model’s OOD performance were designed to be strictly OOD with respect to style. However, the emergence of foundation models and expansive web-scale datasets has obfuscated this evaluation process, as datasets cover a broad range of domains and risk test domain contamination. In search of the forgotten domain generalization, we create large-scale datasets subsampled from LAION—LAION-Natural and LAION-Rendition—that are strictly OOD to corresponding ImageNet and DomainNet test sets in terms of style. Training CLIP models on these datasets reveals that a significant portion of their performance is explained by in-domain examples. This indicates that the OOD generalization challenges from the ImageNet era still prevail and that training on web-scale data merely creates the illusion of OOD generalization. Furthermore, through a systematic exploration of combining natural and rendition datasets in varying proportions, we identify optimal mixing ratios for model generalization across these domains. Our datasets and results re-enable meaningful assessment of OOD robustness at scale—a crucial prerequisite for improving model robustness.

Cite this Paper

BibTeX

@InProceedings{pmlr-v296-mayilvahanan25a,
  title = 	 {In Search of Forgotten Domain Generalization},
  author =       {Mayilvahanan, Prasanna and Zimmermann, Roland S. and Wiedemer, Thadd\"{a}us and Rusak, Evgenia and Juhos, Attila and Bethge, Matthias and Brendel, Wieland},
  booktitle = 	 {Proceedings on "I Can't Believe It's Not Better: Challenges in Applied Deep Learning" at ICLR 2025 Workshops},
  pages = 	 {90--130},
  year = 	 {2025},
  editor = 	 {Blaas, Arno and D'Costa, Priya and Feng, Fan and Kriegler, Andreas and Mason, Ian and Pan, Zhaoying and Uelwer, Tobias and Williams, Jennifer and Xie, Yubin and Yang, Rui},
  volume = 	 {296},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {28 Apr},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v296/main/assets/mayilvahanan25a/mayilvahanan25a.pdf},
  url = 	 {https://proceedings.mlr.press/v296/mayilvahanan25a.html},
  abstract = 	 {Out-of-Domain (OOD) generalization is the ability of a model trained on one or more domains to generalize to unseen domains. In the ImageNet era of computer vision, evaluation sets for measuring a model’s OOD performance were designed to be strictly OOD with respect to style. However, the emergence of foundation models and expansive web-scale datasets has obfuscated this evaluation process, as datasets cover a broad range of domains and risk test domain contamination. In search of the forgotten domain generalization, we create large-scale datasets subsampled from LAION—LAION-Natural and LAION-Rendition—that are strictly OOD to corresponding ImageNet and DomainNet test sets in terms of style. Training CLIP models on these datasets reveals that a significant portion of their performance is explained by in-domain examples. This indicates that the OOD generalization challenges from the ImageNet era still prevail and that training on web-scale data merely creates the illusion of OOD generalization. Furthermore, through a systematic exploration of combining natural and rendition datasets in varying proportions, we identify optimal mixing ratios for model generalization across these domains. Our datasets and results re-enable meaningful assessment of OOD robustness at scale—a crucial prerequisite for improving model robustness.}
}

Endnote

%0 Conference Paper
%T In Search of Forgotten Domain Generalization
%A Prasanna Mayilvahanan
%A Roland S. Zimmermann
%A Thaddäus Wiedemer
%A Evgenia Rusak
%A Attila Juhos
%A Matthias Bethge
%A Wieland Brendel
%B Proceedings on "I Can't Believe It's Not Better: Challenges in Applied Deep Learning" at ICLR 2025 Workshops
%C Proceedings of Machine Learning Research
%D 2025
%E Arno Blaas
%E Priya D’Costa
%E Fan Feng
%E Andreas Kriegler
%E Ian Mason
%E Zhaoying Pan
%E Tobias Uelwer
%E Jennifer Williams
%E Yubin Xie
%E Rui Yang
%F pmlr-v296-mayilvahanan25a
%I PMLR
%P 90--130
%U https://proceedings.mlr.press/v296/mayilvahanan25a.html
%V 296
%X Out-of-Domain (OOD) generalization is the ability of a model trained on one or more domains to generalize to unseen domains. In the ImageNet era of computer vision, evaluation sets for measuring a model’s OOD performance were designed to be strictly OOD with respect to style. However, the emergence of foundation models and expansive web-scale datasets has obfuscated this evaluation process, as datasets cover a broad range of domains and risk test domain contamination. In search of the forgotten domain generalization, we create large-scale datasets subsampled from LAION—LAION-Natural and LAION-Rendition—that are strictly OOD to corresponding ImageNet and DomainNet test sets in terms of style. Training CLIP models on these datasets reveals that a significant portion of their performance is explained by in-domain examples. This indicates that the OOD generalization challenges from the ImageNet era still prevail and that training on web-scale data merely creates the illusion of OOD generalization. Furthermore, through a systematic exploration of combining natural and rendition datasets in varying proportions, we identify optimal mixing ratios for model generalization across these domains. Our datasets and results re-enable meaningful assessment of OOD robustness at scale—a crucial prerequisite for improving model robustness.

APA

Mayilvahanan, P., Zimmermann, R. S., Wiedemer, T., Rusak, E., Juhos, A., Bethge, M. & Brendel, W. (2025). In Search of Forgotten Domain Generalization. Proceedings on "I Can't Believe It's Not Better: Challenges in Applied Deep Learning" at ICLR 2025 Workshops, in Proceedings of Machine Learning Research 296:90-130. Available from https://proceedings.mlr.press/v296/mayilvahanan25a.html.
