Limits of Model Selection under Transfer Learning

Steve Hanneke, Samory Kpotufe, Yasaman Mahdaviyeh
Proceedings of Thirty Sixth Conference on Learning Theory, PMLR 195:5781-5812, 2023.

Abstract

Theoretical studies on \emph{transfer learning} (or \emph{domain adaptation}) have so far focused on situations with a known hypothesis class or \emph{model}; however, in practice, some amount of model selection is usually involved, often appearing under the umbrella term of \emph{hyperparameter-tuning}: for example, one may think of the problem of \emph{tuning} for the right neural network architecture towards a target task, while leveraging data from a related \emph{source} task. Now, in addition to the usual tradeoffs between approximation and estimation errors involved in model selection, this problem brings in a new complexity term, namely, the \emph{transfer distance} between source and target distributions, which is known to vary with the choice of hypothesis class. We present a first study of this problem, focusing on classification; in particular, the analysis reveals some remarkable phenomena: \emph{adaptive rates}, i.e., those achievable with no distributional information, can be arbitrarily slower than \emph{oracle rates}, i.e., those achievable given knowledge of such \emph{distances}.
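
To make the notion of a \emph{transfer distance} concrete, one common formalization in prior theoretical work on transfer learning is a \emph{transfer exponent}; the sketch below is illustrative, and its notation ($P$, $Q$, $\mathcal{H}$, $\rho$) is not taken from this paper. Writing $R_D(h) = \Pr_{(X,Y)\sim D}[h(X) \neq Y]$ for the risk of a classifier $h$ under a distribution $D$, and
\[
\mathcal{E}_D(h) \;=\; R_D(h) \;-\; \inf_{h' \in \mathcal{H}} R_D(h')
\]
for its excess risk over a hypothesis class $\mathcal{H}$, a source $P$ has transfer exponent $\rho \ge 1$ with respect to a target $Q$ (and the class $\mathcal{H}$) if, for some constant $C > 0$ and all $h \in \mathcal{H}$,
\[
C \cdot \mathcal{E}_P(h) \;\ge\; \mathcal{E}_Q(h)^{\rho},
\]
so that driving the source excess risk down forces the target excess risk down as well. Larger $\rho$ corresponds to a "farther" source; crucially, $\rho$ depends on the choice of $\mathcal{H}$, which is why model selection interacts with the effective distance between source and target tasks.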

Cite this Paper


BibTeX
@InProceedings{pmlr-v195-hanneke23c,
  title     = {Limits of Model Selection under Transfer Learning},
  author    = {Hanneke, Steve and Kpotufe, Samory and Mahdaviyeh, Yasaman},
  booktitle = {Proceedings of Thirty Sixth Conference on Learning Theory},
  pages     = {5781--5812},
  year      = {2023},
  editor    = {Neu, Gergely and Rosasco, Lorenzo},
  volume    = {195},
  series    = {Proceedings of Machine Learning Research},
  month     = {12--15 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v195/hanneke23c/hanneke23c.pdf},
  url       = {https://proceedings.mlr.press/v195/hanneke23c.html}
}
APA
Hanneke, S., Kpotufe, S. & Mahdaviyeh, Y. (2023). Limits of Model Selection under Transfer Learning. Proceedings of Thirty Sixth Conference on Learning Theory, in Proceedings of Machine Learning Research 195:5781-5812. Available from https://proceedings.mlr.press/v195/hanneke23c.html.
