Fast Adaptation with Linearized Neural Networks

Wesley Maddox, Shuai Tang, Pablo Moreno, Andrew Gordon Wilson, Andreas Damianou
Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:2737-2745, 2021.

Abstract

The inductive biases of trained neural networks are difficult to understand and, consequently, to adapt to new settings. We study the inductive biases of linearizations of neural networks, which we show to be surprisingly good summaries of the full network functions. Inspired by this finding, we propose a technique for embedding these inductive biases into Gaussian processes through a kernel designed from the Jacobian of the network. In this setting, domain adaptation takes the form of interpretable posterior inference, with accompanying uncertainty estimation. This inference is analytic and free of local optima issues found in standard techniques such as fine-tuning neural network weights to a new task. We develop significant computational speed-ups based on matrix multiplies, including a novel implementation for scalable Fisher vector products. Our experiments on both image classification and regression demonstrate the promise and convenience of this framework for transfer learning, compared to neural network fine-tuning.
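As a rough illustration of the central construction (not the authors' implementation), the minimal JAX sketch below uses a toy MLP and synthetic regression data: the network's Jacobian with respect to its weights serves as a feature map, so k(x, x') = J(x) J(x')^T defines a GP kernel, and adapting to a new task amounts to computing the exact GP posterior under that kernel. All names, sizes, and hyperparameters here are illustrative assumptions.

    # Minimal sketch of the "linearized network as GP kernel" idea.
    # Not the paper's code; a toy single-output MLP stands in for a pretrained network.
    import jax
    import jax.numpy as jnp

    def mlp(params, x):
        # Tiny MLP with one hidden layer and a scalar output.
        w1, b1, w2, b2 = params
        h = jnp.tanh(x @ w1 + b1)
        return (h @ w2 + b2).squeeze(-1)

    def jacobian_features(params, X):
        # Flatten d f(x) / d theta into one feature vector per input.
        jac = jax.vmap(lambda x: jax.grad(lambda p: mlp(p, x))(params))(X)
        leaves = jax.tree_util.tree_leaves(jac)
        return jnp.concatenate([l.reshape(X.shape[0], -1) for l in leaves], axis=1)

    def gp_posterior_mean(params, X_train, y_train, X_test, noise=1e-2):
        # Exact GP regression with the Jacobian ("linearized network") kernel.
        Phi_tr = jacobian_features(params, X_train)
        Phi_te = jacobian_features(params, X_test)
        K = Phi_tr @ Phi_tr.T + noise * jnp.eye(X_train.shape[0])
        K_star = Phi_te @ Phi_tr.T
        return K_star @ jnp.linalg.solve(K, y_train)

    key = jax.random.PRNGKey(0)
    k1, k2 = jax.random.split(key)
    params = (jax.random.normal(k1, (1, 16)) * 0.5, jnp.zeros(16),
              jax.random.normal(k2, (16, 1)) * 0.5, jnp.zeros(1))
    X_train = jnp.linspace(-3, 3, 20)[:, None]
    y_train = jnp.sin(X_train).squeeze(-1)                      # "new task" data
    X_test = jnp.linspace(-3, 3, 100)[:, None]
    mean = gp_posterior_mean(params, X_train, y_train, X_test)  # analytic adaptation
    print(mean.shape)  # (100,)

This toy version materializes the Jacobian explicitly; the paper's contribution includes making such Jacobian and Fisher vector products scale to real networks through matrix multiplies, which the sketch only hints at.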

Cite this Paper


BibTeX
@InProceedings{pmlr-v130-maddox21a,
  title     = {Fast Adaptation with Linearized Neural Networks},
  author    = {Maddox, Wesley and Tang, Shuai and Moreno, Pablo and Gordon Wilson, Andrew and Damianou, Andreas},
  booktitle = {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages     = {2737--2745},
  year      = {2021},
  editor    = {Banerjee, Arindam and Fukumizu, Kenji},
  volume    = {130},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--15 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v130/maddox21a/maddox21a.pdf},
  url       = {https://proceedings.mlr.press/v130/maddox21a.html},
  abstract  = {The inductive biases of trained neural networks are difficult to understand and, consequently, to adapt to new settings. We study the inductive biases of linearizations of neural networks, which we show to be surprisingly good summaries of the full network functions. Inspired by this finding, we propose a technique for embedding these inductive biases into Gaussian processes through a kernel designed from the Jacobian of the network. In this setting, domain adaptation takes the form of interpretable posterior inference, with accompanying uncertainty estimation. This inference is analytic and free of local optima issues found in standard techniques such as fine-tuning neural network weights to a new task. We develop significant computational speed-ups based on matrix multiplies, including a novel implementation for scalable Fisher vector products. Our experiments on both image classification and regression demonstrate the promise and convenience of this framework for transfer learning, compared to neural network fine-tuning.}
}
Endnote
%0 Conference Paper
%T Fast Adaptation with Linearized Neural Networks
%A Wesley Maddox
%A Shuai Tang
%A Pablo Moreno
%A Andrew Gordon Wilson
%A Andreas Damianou
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu
%F pmlr-v130-maddox21a
%I PMLR
%P 2737--2745
%U https://proceedings.mlr.press/v130/maddox21a.html
%V 130
%X The inductive biases of trained neural networks are difficult to understand and, consequently, to adapt to new settings. We study the inductive biases of linearizations of neural networks, which we show to be surprisingly good summaries of the full network functions. Inspired by this finding, we propose a technique for embedding these inductive biases into Gaussian processes through a kernel designed from the Jacobian of the network. In this setting, domain adaptation takes the form of interpretable posterior inference, with accompanying uncertainty estimation. This inference is analytic and free of local optima issues found in standard techniques such as fine-tuning neural network weights to a new task. We develop significant computational speed-ups based on matrix multiplies, including a novel implementation for scalable Fisher vector products. Our experiments on both image classification and regression demonstrate the promise and convenience of this framework for transfer learning, compared to neural network fine-tuning.
APA
Maddox, W., Tang, S., Moreno, P., Gordon Wilson, A. & Damianou, A. (2021). Fast Adaptation with Linearized Neural Networks. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:2737-2745. Available from https://proceedings.mlr.press/v130/maddox21a.html.