When is Transfer Learning Possible?
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:40642-40666, 2024.
Abstract
We present a general framework for transfer learning that is flexible enough to capture transfer in supervised, reinforcement, and imitation learning. Our framework enables new insights into the fundamental question of when we can successfully transfer learned information across problems. We model the learner as interacting with a sequence of problem instances, or environments, each of which is generated from a common structural causal model (SCM) by choosing the SCM’s parameters from restricted sets. We derive a procedure that can propagate restrictions on SCM parameters through the SCM’s graph structure to other parameters that we are trying to learn. The propagated restrictions then enable more efficient learning (i.e., transfer). By analyzing the procedure, we are able to challenge widely held beliefs about transfer learning. First, we show that having sparse changes across environments is neither necessary nor sufficient for transfer. Second, we show an example where the common heuristic of freezing a layer in a network causes poor transfer performance. We then use our procedure to select a more refined set of parameters to freeze, leading to successful transfer learning.
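To make the setup above concrete, the following is a minimal toy sketch, not the paper's procedure. It assumes a hypothetical linear chain SCM X -> Z -> Y with parameters a (X to Z) and b (Z to Y), where each environment draws a and b from restricted intervals. The quantity to be learned is taken to be the slope of Y on X, which equals a*b in this toy model, so interval restrictions on a and b propagate to an interval restriction on the slope. All function names, noise levels, and parameter ranges are illustrative assumptions.

```python
import random


def sample_environment(a_range, b_range, rng):
    """Draw one environment by picking SCM parameters from their restricted sets."""
    a = rng.uniform(*a_range)
    b = rng.uniform(*b_range)
    return a, b


def propagate_product_interval(a_range, b_range):
    """Propagate interval restrictions on a and b to the product a*b.

    The extremes of a product over two intervals occur at corner points,
    so checking the four corner products is enough.
    """
    corners = [a * b for a in a_range for b in b_range]
    return min(corners), max(corners)


def simulate_slope(a, b, n, rng):
    """Simulate the toy chain X -> Z -> Y and return the least-squares slope of Y on X."""
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        z = a * x + rng.gauss(0.0, 0.1)  # assumed small additive noise
        y = b * z + rng.gauss(0.0, 0.1)
        xs.append(x)
        ys.append(y)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


if __name__ == "__main__":
    rng = random.Random(0)
    a_range = (0.5, 1.5)  # a may change across environments
    b_range = (2.0, 2.0)  # b is shared by all environments
    low, high = propagate_product_interval(a_range, b_range)
    a, b = sample_environment(a_range, b_range, rng)
    slope = simulate_slope(a, b, 2000, rng)
    print(f"slope restricted to [{low}, {high}]; estimated slope = {slope:.3f}")
```

In this toy model, a learner facing a new environment only needs to search for the slope inside the propagated interval rather than over all real values, which is one simple sense in which propagated restrictions could make learning more efficient.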