Not Just Pretty Pictures: Toward Interventional Data Augmentation Using Text-to-Image Generators

Jianhao Yuan, Francesco Pinto, Adam Davies, Philip Torr
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:57924-57952, 2024.

Abstract

Neural image classifiers are known to undergo severe performance degradation when exposed to inputs that are sampled from environmental conditions that differ from their training data. Given the recent progress in Text-to-Image (T2I) generation, a natural question is how modern T2I generators can be used to simulate arbitrary interventions over such environmental factors in order to augment training data and improve the robustness of downstream classifiers. We experiment across a diverse collection of benchmarks in single domain generalization (SDG) and reducing reliance on spurious features (RRSF), ablating across key dimensions of T2I generation, including interventional prompting strategies, conditioning mechanisms, and post-hoc filtering, showing that modern T2I generators like Stable Diffusion can indeed be used to implement a powerful interventional data augmentation (IDA) mechanism, outperforming previously state-of-the-art data augmentation techniques regardless of how each dimension is configured.

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-yuan24e,
  title     = {Not Just Pretty Pictures: Toward Interventional Data Augmentation Using Text-to-Image Generators},
  author    = {Yuan, Jianhao and Pinto, Francesco and Davies, Adam and Torr, Philip},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {57924--57952},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/yuan24e/yuan24e.pdf},
  url       = {https://proceedings.mlr.press/v235/yuan24e.html},
  abstract  = {Neural image classifiers are known to undergo severe performance degradation when exposed to inputs that are sampled from environmental conditions that differ from their training data. Given the recent progress in Text-to-Image (T2I) generation, a natural question is how modern T2I generators can be used to simulate arbitrary interventions over such environmental factors in order to augment training data and improve the robustness of downstream classifiers. We experiment across a diverse collection of benchmarks in single domain generalization (SDG) and reducing reliance on spurious features (RRSF), ablating across key dimensions of T2I generation, including interventional prompting strategies, conditioning mechanisms, and post-hoc filtering, showing that modern T2I generators like Stable Diffusion can indeed be used to implement a powerful interventional data augmentation (IDA) mechanism, outperforming previously state-of-the-art data augmentation techniques regardless of how each dimension is configured.}
}
Endnote
%0 Conference Paper
%T Not Just Pretty Pictures: Toward Interventional Data Augmentation Using Text-to-Image Generators
%A Jianhao Yuan
%A Francesco Pinto
%A Adam Davies
%A Philip Torr
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-yuan24e
%I PMLR
%P 57924--57952
%U https://proceedings.mlr.press/v235/yuan24e.html
%V 235
%X Neural image classifiers are known to undergo severe performance degradation when exposed to inputs that are sampled from environmental conditions that differ from their training data. Given the recent progress in Text-to-Image (T2I) generation, a natural question is how modern T2I generators can be used to simulate arbitrary interventions over such environmental factors in order to augment training data and improve the robustness of downstream classifiers. We experiment across a diverse collection of benchmarks in single domain generalization (SDG) and reducing reliance on spurious features (RRSF), ablating across key dimensions of T2I generation, including interventional prompting strategies, conditioning mechanisms, and post-hoc filtering, showing that modern T2I generators like Stable Diffusion can indeed be used to implement a powerful interventional data augmentation (IDA) mechanism, outperforming previously state-of-the-art data augmentation techniques regardless of how each dimension is configured.
APA
Yuan, J., Pinto, F., Davies, A. & Torr, P. (2024). Not Just Pretty Pictures: Toward Interventional Data Augmentation Using Text-to-Image Generators. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:57924-57952. Available from https://proceedings.mlr.press/v235/yuan24e.html.
