Risk Bounds for Transferring Representations With and Without Fine-Tuning

Daniel McNamara, Maria-Florina Balcan
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:2373-2381, 2017.

Abstract

A popular machine learning strategy is the transfer of a representation (i.e. a feature extraction function) learned on a source task to a target task. Examples include the re-use of neural network weights or word embeddings. We develop sufficient conditions for the success of this approach. If the representation learned from the source task is fixed, we identify conditions on how the tasks relate to obtain an upper bound on target task risk via a VC dimension-based argument. We then consider using the representation from the source task to construct a prior, which is fine-tuned using target task data. We give a PAC-Bayes target task risk bound in this setting under suitable conditions. We show examples of our bounds using feedforward neural networks. Our results motivate a practical approach to weight transfer, which we validate with experiments.
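To make the two transfer regimes described above concrete, the following is a minimal, hypothetical PyTorch sketch (not taken from the paper): the source-task representation is either copied and frozen, or copied as an initialization (a "prior") and fine-tuned on the target task. The helper and variable names (make_network, source_rep, and so on) are illustrative assumptions, not the authors' code.

# Minimal sketch of weight transfer with a fixed vs. fine-tuned representation.
import torch
import torch.nn as nn

def make_network(in_dim=32, hidden_dim=64, out_dim=2):
    """Feedforward network: a representation (feature extractor) plus a linear head."""
    representation = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
    head = nn.Linear(hidden_dim, out_dim)
    return representation, head

# Assume this representation has already been trained on the source task.
source_rep, _ = make_network()

# Regime 1: fixed representation (the VC dimension-based bound).
# Copy the source weights and freeze them; only the new target-task head is trained.
target_rep_fixed, target_head = make_network()
target_rep_fixed.load_state_dict(source_rep.state_dict())
for p in target_rep_fixed.parameters():
    p.requires_grad = False
optimizer_fixed = torch.optim.SGD(target_head.parameters(), lr=0.1)

# Regime 2: fine-tuning from a source-derived prior (the PAC-Bayes bound).
# Copy the source weights as an initialization, then train all parameters on the
# target task (optionally penalizing distance from the transferred weights).
target_rep_ft, target_head_ft = make_network()
target_rep_ft.load_state_dict(source_rep.state_dict())
optimizer_ft = torch.optim.SGD(
    list(target_rep_ft.parameters()) + list(target_head_ft.parameters()), lr=0.01
)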

Cite this Paper


BibTeX
@InProceedings{pmlr-v70-mcnamara17a,
  title     = {Risk Bounds for Transferring Representations With and Without Fine-Tuning},
  author    = {Daniel McNamara and Maria-Florina Balcan},
  booktitle = {Proceedings of the 34th International Conference on Machine Learning},
  pages     = {2373--2381},
  year      = {2017},
  editor    = {Precup, Doina and Teh, Yee Whye},
  volume    = {70},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--11 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v70/mcnamara17a/mcnamara17a.pdf},
  url       = {https://proceedings.mlr.press/v70/mcnamara17a.html},
  abstract  = {A popular machine learning strategy is the transfer of a representation (i.e. a feature extraction function) learned on a source task to a target task. Examples include the re-use of neural network weights or word embeddings. We develop sufficient conditions for the success of this approach. If the representation learned from the source task is fixed, we identify conditions on how the tasks relate to obtain an upper bound on target task risk via a VC dimension-based argument. We then consider using the representation from the source task to construct a prior, which is fine-tuned using target task data. We give a PAC-Bayes target task risk bound in this setting under suitable conditions. We show examples of our bounds using feedforward neural networks. Our results motivate a practical approach to weight transfer, which we validate with experiments.}
}
Endnote
%0 Conference Paper
%T Risk Bounds for Transferring Representations With and Without Fine-Tuning
%A Daniel McNamara
%A Maria-Florina Balcan
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-mcnamara17a
%I PMLR
%P 2373--2381
%U https://proceedings.mlr.press/v70/mcnamara17a.html
%V 70
%X A popular machine learning strategy is the transfer of a representation (i.e. a feature extraction function) learned on a source task to a target task. Examples include the re-use of neural network weights or word embeddings. We develop sufficient conditions for the success of this approach. If the representation learned from the source task is fixed, we identify conditions on how the tasks relate to obtain an upper bound on target task risk via a VC dimension-based argument. We then consider using the representation from the source task to construct a prior, which is fine-tuned using target task data. We give a PAC-Bayes target task risk bound in this setting under suitable conditions. We show examples of our bounds using feedforward neural networks. Our results motivate a practical approach to weight transfer, which we validate with experiments.
APA
McNamara, D. & Balcan, M.-F. (2017). Risk Bounds for Transferring Representations With and Without Fine-Tuning. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:2373-2381. Available from https://proceedings.mlr.press/v70/mcnamara17a.html.
