Prompting is a Double-Edged Sword: Improving Worst-Group Robustness of Foundation Models

Amrith Setlur, Saurabh Garg, Virginia Smith, Sergey Levine
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:44224-44243, 2024.

Abstract

Machine learning models can fail catastrophically under distribution shift, but a surprisingly effective way to empirically improve robustness to some types of shift (e.g., ImageNet-A/C) is to use stronger open-vocabulary classifiers derived from foundation models. In this work, we first note that for shifts governed by spurious correlations (features spuriously correlated with the label in the training data but not at test time), the zero-shot and few-shot performance of foundation models is no better than that of ERM models, and remains unchanged when pretraining data or model size is scaled. Second, even in these settings, foundation models are quite accurate at predicting the value of the spurious feature. In a simplified setup, we theoretically analyze both findings. Specifically, we show that during contrastive pretraining, the simplicity bias of foundation models tends to favor the learning of features that rely mostly on the spurious attribute over more robust features. We leverage these observations to propose Prompting for Robustness (PfR), which first uses foundation models to zero-shot predict the spurious attribute on labeled examples, and then learns a classifier with balanced performance across the groups defined by the label and the spurious attribute. Across five vision and language tasks, we show that PfR's performance nearly matches that of an oracle algorithm (group DRO) that leverages human-labeled spurious attributes.
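The two-stage recipe above can be made concrete with a short sketch. Below is a minimal PyTorch illustration, assuming a CLIP-style model whose image and text embeddings share a space: step 1 zero-shot predicts the spurious attribute by matching image embeddings against attribute prompts; step 2 trains a probe with a group-DRO-style objective over groups formed by (label, predicted attribute). The encoders are replaced here by random stand-in tensors, and all names and prompts are illustrative assumptions, not the authors' released implementation.

import torch
import torch.nn.functional as F

def zero_shot_attribute(img_feats, attr_text_feats):
    # Step 1: prompt for the spurious attribute. Each row of attr_text_feats
    # embeds a hypothetical attribute prompt, e.g. "a photo of a bird on land"
    # vs. "a photo of a bird on water"; the closest prompt gives the attribute.
    return (img_feats @ attr_text_feats.T).argmax(dim=1)  # (N,) attribute ids

def group_dro_loss(logits, labels, groups, num_groups, weights=None, eta=0.01):
    # Step 2: group-DRO-style objective over groups = (label, attribute).
    # Upweights the currently worst group via an exponentiated-gradient update.
    if weights is None:
        weights = torch.ones(num_groups) / num_groups
    per_example = F.cross_entropy(logits, labels, reduction="none")
    group_losses = torch.stack([
        per_example[groups == g].mean() if (groups == g).any() else torch.tensor(0.0)
        for g in range(num_groups)
    ])
    with torch.no_grad():
        weights = weights * torch.exp(eta * group_losses)
        weights = weights / weights.sum()
    return (weights * group_losses).sum(), weights

if __name__ == "__main__":
    torch.manual_seed(0)
    N, d, C, A = 64, 16, 2, 2  # examples, embed dim, classes, attribute values
    img_feats = F.normalize(torch.randn(N, d), dim=1)        # stand-in image embeddings
    attr_text_feats = F.normalize(torch.randn(A, d), dim=1)  # stand-in prompt embeddings
    labels = torch.randint(0, C, (N,))
    probe = torch.nn.Linear(d, C)  # classifier head on frozen features

    a_hat = zero_shot_attribute(img_feats, attr_text_feats)
    groups = labels * A + a_hat    # one group per (label, predicted attribute) pair
    loss, _ = group_dro_loss(probe(img_feats), labels, groups, num_groups=C * A)
    loss.backward()

The exponentiated-gradient update on group weights mirrors the standard group DRO recipe; the distinguishing feature of PfR is that the attribute labels come from prompting a foundation model rather than from human annotation.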

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-setlur24a,
  title     = {Prompting is a Double-Edged Sword: Improving Worst-Group Robustness of Foundation Models},
  author    = {Setlur, Amrith and Garg, Saurabh and Smith, Virginia and Levine, Sergey},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {44224--44243},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/setlur24a/setlur24a.pdf},
  url       = {https://proceedings.mlr.press/v235/setlur24a.html}
}
Endnote
%0 Conference Paper
%T Prompting is a Double-Edged Sword: Improving Worst-Group Robustness of Foundation Models
%A Amrith Setlur
%A Saurabh Garg
%A Virginia Smith
%A Sergey Levine
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-setlur24a
%I PMLR
%P 44224--44243
%U https://proceedings.mlr.press/v235/setlur24a.html
%V 235
APA
Setlur, A., Garg, S., Smith, V. & Levine, S. (2024). Prompting is a Double-Edged Sword: Improving Worst-Group Robustness of Foundation Models. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:44224-44243. Available from https://proceedings.mlr.press/v235/setlur24a.html.