Automatic Shortcut Removal for Self-Supervised Representation Learning

Matthias Minderer, Olivier Bachem, Neil Houlsby, Michael Tschannen
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:6927-6937, 2020.

Abstract

In self-supervised visual representation learning, a feature extractor is trained on a "pretext task" for which labels can be generated cheaply, without human annotation. A central challenge in this approach is that the feature extractor quickly learns to exploit low-level visual features such as color aberrations or watermarks and then fails to learn useful semantic representations. Much work has gone into identifying such "shortcut" features and hand-designing schemes to reduce their effect. Here, we propose a general framework for mitigating the effect of shortcut features. Our key assumption is that those features which are the first to be exploited for solving the pretext task may also be the most vulnerable to an adversary trained to make the task harder. We show that this assumption holds across common pretext tasks and datasets by training a "lens" network to make small image changes that maximally reduce performance on the pretext task. Representations learned with the modified images outperform those learned from unmodified images in all tested cases. Additionally, the modifications made by the lens reveal how the choice of pretext task and dataset affects the features learned by self-supervision.
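
The adversarial setup described above lends itself to a short illustration. The following is a minimal sketch, not the authors' implementation: it assumes a PyTorch setup with a rotation-prediction pretext task, and the `Lens` module, the `budget` bound, and the toy extractor are illustrative assumptions. The lens is updated to increase the pretext loss (make the task harder), while the feature extractor is updated to decrease it on the lens-processed images:

```python
# Sketch of adversarial "shortcut removal": a small image-to-image "lens"
# makes bounded changes that raise the pretext loss; the feature extractor
# is then trained on the lens-modified images. Hypothetical names/values.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Lens(nn.Module):
    """Tiny image-to-image network producing a bounded residual change."""
    def __init__(self, channels=3, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, channels, 3, padding=1),
        )

    def forward(self, x, budget=0.1):
        # tanh bounds the residual; `budget` keeps the change small.
        return x + budget * torch.tanh(self.net(x))

def train_step(extractor, lens, opt_f, opt_l, images, pretext_labels):
    """One alternating update of the extractor/lens adversarial game."""
    # 1) Lens step: ascend the pretext loss (make the task harder).
    opt_l.zero_grad()
    adv_loss = -F.cross_entropy(extractor(lens(images)), pretext_labels)
    adv_loss.backward()
    opt_l.step()

    # 2) Extractor step: descend the pretext loss on the modified images.
    #    detach() stops extractor gradients from flowing into the lens.
    opt_f.zero_grad()
    loss = F.cross_entropy(extractor(lens(images).detach()), pretext_labels)
    loss.backward()
    opt_f.step()
    return loss.item()

if __name__ == "__main__":
    # Toy extractor: conv trunk + linear head over 4 rotation classes.
    extractor = nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 4),
    )
    lens = Lens()
    opt_f = torch.optim.Adam(extractor.parameters(), lr=1e-3)
    opt_l = torch.optim.Adam(lens.parameters(), lr=1e-3)

    # Rotation-prediction pretext: predict the multiple of 90 degrees.
    images = torch.rand(8, 3, 32, 32)
    labels = torch.randint(0, 4, (8,))
    rotated = torch.stack([torch.rot90(x, k=int(k), dims=(1, 2))
                           for x, k in zip(images, labels)])
    print(train_step(extractor, lens, opt_f, opt_l, rotated, labels))
```

In this reading, the tanh bound plays the role of the paper's constraint that the lens may only make small image changes, and detaching the lens output during the extractor step keeps the two updates adversarial rather than cooperative.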

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-minderer20a,
  title     = {Automatic Shortcut Removal for Self-Supervised Representation Learning},
  author    = {Minderer, Matthias and Bachem, Olivier and Houlsby, Neil and Tschannen, Michael},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {6927--6937},
  year      = {2020},
  editor    = {III, Hal Daumé and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/minderer20a/minderer20a.pdf},
  url       = {https://proceedings.mlr.press/v119/minderer20a.html},
  abstract  = {In self-supervised visual representation learning, a feature extractor is trained on a "pretext task" for which labels can be generated cheaply, without human annotation. A central challenge in this approach is that the feature extractor quickly learns to exploit low-level visual features such as color aberrations or watermarks and then fails to learn useful semantic representations. Much work has gone into identifying such "shortcut" features and hand-designing schemes to reduce their effect. Here, we propose a general framework for mitigating the effect of shortcut features. Our key assumption is that those features which are the first to be exploited for solving the pretext task may also be the most vulnerable to an adversary trained to make the task harder. We show that this assumption holds across common pretext tasks and datasets by training a "lens" network to make small image changes that maximally reduce performance on the pretext task. Representations learned with the modified images outperform those learned from unmodified images in all tested cases. Additionally, the modifications made by the lens reveal how the choice of pretext task and dataset affects the features learned by self-supervision.}
}
Endnote
%0 Conference Paper
%T Automatic Shortcut Removal for Self-Supervised Representation Learning
%A Matthias Minderer
%A Olivier Bachem
%A Neil Houlsby
%A Michael Tschannen
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-minderer20a
%I PMLR
%P 6927--6937
%U https://proceedings.mlr.press/v119/minderer20a.html
%V 119
%X In self-supervised visual representation learning, a feature extractor is trained on a "pretext task" for which labels can be generated cheaply, without human annotation. A central challenge in this approach is that the feature extractor quickly learns to exploit low-level visual features such as color aberrations or watermarks and then fails to learn useful semantic representations. Much work has gone into identifying such "shortcut" features and hand-designing schemes to reduce their effect. Here, we propose a general framework for mitigating the effect of shortcut features. Our key assumption is that those features which are the first to be exploited for solving the pretext task may also be the most vulnerable to an adversary trained to make the task harder. We show that this assumption holds across common pretext tasks and datasets by training a "lens" network to make small image changes that maximally reduce performance on the pretext task. Representations learned with the modified images outperform those learned from unmodified images in all tested cases. Additionally, the modifications made by the lens reveal how the choice of pretext task and dataset affects the features learned by self-supervision.
APA
Minderer, M., Bachem, O., Houlsby, N. & Tschannen, M. (2020). Automatic Shortcut Removal for Self-Supervised Representation Learning. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:6927-6937. Available from https://proceedings.mlr.press/v119/minderer20a.html.