TaskNorm: Rethinking Batch Normalization for Meta-Learning

John Bronskill; Jonathan Gordon; James Requeima; Sebastian Nowozin; Richard Turner

Proceedings of the 37th International Conference on Machine Learning, PMLR 119:1153-1164, 2020.

Abstract

Modern meta-learning approaches for image classification rely on increasingly deep networks to achieve state-of-the-art performance, making batch normalization an essential component of meta-learning pipelines. However, the hierarchical nature of the meta-learning setting presents several challenges that can render conventional batch normalization ineffective, giving rise to the need to rethink normalization in this setting. We evaluate a range of approaches to batch normalization for meta-learning scenarios, and develop a novel approach that we call TaskNorm. Experiments on fourteen datasets demonstrate that the choice of batch normalization has a dramatic effect on both classification accuracy and training time for both gradient-based and gradient-free meta-learning approaches. Importantly, TaskNorm is found to consistently improve performance. Finally, we provide a set of best practices for normalization that will allow fair comparison of meta-learning algorithms.

Cite this Paper

BibTeX


@InProceedings{pmlr-v119-bronskill20a,
  title = 	 {{T}ask{N}orm: Rethinking Batch Normalization for Meta-Learning},
  author =       {Bronskill, John and Gordon, Jonathan and Requeima, James and Nowozin, Sebastian and Turner, Richard},
  booktitle = 	 {Proceedings of the 37th International Conference on Machine Learning},
  pages = 	 {1153--1164},
  year = 	 {2020},
  editor = 	 {Daumé III, Hal and Singh, Aarti},
  volume = 	 {119},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--18 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v119/bronskill20a/bronskill20a.pdf},
  url = 	 {https://proceedings.mlr.press/v119/bronskill20a.html},
  abstract = 	 {Modern meta-learning approaches for image classification rely on increasingly deep networks to achieve state-of-the-art performance, making batch normalization an essential component of meta-learning pipelines. However, the hierarchical nature of the meta-learning setting presents several challenges that can render conventional batch normalization ineffective, giving rise to the need to rethink normalization in this setting. We evaluate a range of approaches to batch normalization for meta-learning scenarios, and develop a novel approach that we call TaskNorm. Experiments on fourteen datasets demonstrate that the choice of batch normalization has a dramatic effect on both classification accuracy and training time for both gradient-based and gradient-free meta-learning approaches. Importantly, TaskNorm is found to consistently improve performance. Finally, we provide a set of best practices for normalization that will allow fair comparison of meta-learning algorithms.}
}

Endnote

%0 Conference Paper
%T TaskNorm: Rethinking Batch Normalization for Meta-Learning
%A John Bronskill
%A Jonathan Gordon
%A James Requeima
%A Sebastian Nowozin
%A Richard Turner
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-bronskill20a
%I PMLR
%P 1153--1164
%U https://proceedings.mlr.press/v119/bronskill20a.html
%V 119
%X Modern meta-learning approaches for image classification rely on increasingly deep networks to achieve state-of-the-art performance, making batch normalization an essential component of meta-learning pipelines. However, the hierarchical nature of the meta-learning setting presents several challenges that can render conventional batch normalization ineffective, giving rise to the need to rethink normalization in this setting. We evaluate a range of approaches to batch normalization for meta-learning scenarios, and develop a novel approach that we call TaskNorm. Experiments on fourteen datasets demonstrate that the choice of batch normalization has a dramatic effect on both classification accuracy and training time for both gradient-based and gradient-free meta-learning approaches. Importantly, TaskNorm is found to consistently improve performance. Finally, we provide a set of best practices for normalization that will allow fair comparison of meta-learning algorithms.

APA


Bronskill, J., Gordon, J., Requeima, J., Nowozin, S. & Turner, R. (2020). TaskNorm: Rethinking Batch Normalization for Meta-Learning. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:1153-1164. Available from https://proceedings.mlr.press/v119/bronskill20a.html.
