Head2Toe: Utilizing Intermediate Representations for Better Transfer Learning

Utku Evci, Vincent Dumoulin, Hugo Larochelle, Michael C Mozer
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:6009-6033, 2022.

Abstract

Transfer-learning methods aim to improve performance in a data-scarce target domain using a model pretrained on a data-rich source domain. A cost-efficient strategy, linear probing, freezes the source model and trains a new classification head for the target domain. This strategy is outperformed by a more costly but state-of-the-art method, fine-tuning all parameters of the source model to the target domain, possibly because fine-tuning allows the model to leverage useful information from intermediate layers that is otherwise discarded by the later, previously trained layers. We explore the hypothesis that these intermediate layers can be exploited directly. We propose a method, Head-to-Toe probing (Head2Toe), that selects features from all layers of the source model to train a classification head for the target domain. In evaluations on the Visual Task Adaptation Benchmark-1k, Head2Toe matches the average performance of fine-tuning while reducing training and storage costs a hundredfold or more; critically, for out-of-distribution transfer, Head2Toe outperforms fine-tuning. The code used in our experiments can be found in the supplementary materials.
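To make the mechanism concrete, below is a minimal sketch of Head2Toe-style probing in PyTorch, not the authors' released implementation. It freezes a pretrained backbone, average-pools the activations of each intermediate block, concatenates them into one feature vector, and trains only a linear head on top. The backbone choice (ResNet-50), the hooked layer names, and the target-task details are illustrative assumptions; the paper additionally selects a subset of the concatenated features before training the head, which this sketch omits.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models

# Freeze a pretrained backbone (ResNet-50 is an illustrative choice).
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
backbone.eval()
for p in backbone.parameters():
    p.requires_grad_(False)

# Capture intermediate activations via forward hooks.
LAYERS = ["layer1", "layer2", "layer3", "layer4"]
acts = {}

def make_hook(name):
    def hook(module, inputs, output):
        # Pool each feature map to a fixed-size vector so layers with
        # different spatial sizes can be concatenated.
        acts[name] = torch.flatten(F.adaptive_avg_pool2d(output, 1), 1)
    return hook

for name in LAYERS:
    getattr(backbone, name).register_forward_hook(make_hook(name))

def embed(x):
    # Concatenate pooled features from all hooked layers: "head to toe".
    with torch.no_grad():
        backbone(x)
    return torch.cat([acts[n] for n in LAYERS], dim=1)

# Train only a linear head on the concatenated features.
num_classes = 10  # hypothetical target task
feat_dim = embed(torch.zeros(1, 3, 224, 224)).shape[1]
head = nn.Linear(feat_dim, num_classes)
opt = torch.optim.Adam(head.parameters(), lr=1e-3)

# Training-loop sketch; `loader` is assumed to yield (images, labels)
# batches from the small target dataset.
# for x, y in loader:
#     loss = F.cross_entropy(head(embed(x)), y)
#     opt.zero_grad(); loss.backward(); opt.step()
```

Because the backbone is frozen, only the linear head's parameters are stored and updated per target task, which is the source of the training- and storage-cost savings the abstract describes.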

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-evci22a,
  title     = {{H}ead2{T}oe: Utilizing Intermediate Representations for Better Transfer Learning},
  author    = {Evci, Utku and Dumoulin, Vincent and Larochelle, Hugo and Mozer, Michael C},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {6009--6033},
  year      = {2022},
  editor    = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume    = {162},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--23 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v162/evci22a/evci22a.pdf},
  url       = {https://proceedings.mlr.press/v162/evci22a.html}
}
Endnote
%0 Conference Paper
%T Head2Toe: Utilizing Intermediate Representations for Better Transfer Learning
%A Utku Evci
%A Vincent Dumoulin
%A Hugo Larochelle
%A Michael C Mozer
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-evci22a
%I PMLR
%P 6009--6033
%U https://proceedings.mlr.press/v162/evci22a.html
%V 162
APA
Evci, U., Dumoulin, V., Larochelle, H., & Mozer, M. C. (2022). Head2Toe: Utilizing Intermediate Representations for Better Transfer Learning. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:6009-6033. Available from https://proceedings.mlr.press/v162/evci22a.html.