Making Convolutional Networks Shift-Invariant Again

Richard Zhang

Making Convolutional Networks Shift-Invariant Again

Richard Zhang

Proceedings of the 36th International Conference on Machine Learning, PMLR 97:7324-7334, 2019.

Abstract

Modern convolutional networks are not shift-invariant, as small input shifts or translations can cause drastic changes in the output. Commonly used downsampling methods, such as max-pooling, strided-convolution, and average-pooling, ignore the sampling theorem. The well-known signal processing fix is anti-aliasing by low-pass filtering before downsampling. However, simply inserting this module into deep networks leads to performance degradation; as a result, it is seldomly used today. We show that when integrated correctly, it is compatible with existing architectural components, such as max-pooling. The technique is general and can be incorporated across layer types and applications, such as image classification and conditional image generation. In addition to increased shift-invariance, we also observe, surprisingly, that anti-aliasing boosts accuracy in ImageNet classification, across several commonly-used architectures. This indicates that anti-aliasing serves as effective regularization. Our results demonstrate that this classical signal processing technique has been undeservingly overlooked in modern deep networks.

Cite this Paper

BibTeX


@InProceedings{pmlr-v97-zhang19a,
  title = 	 {Making Convolutional Networks Shift-Invariant Again},
  author =       {Zhang, Richard},
  booktitle = 	 {Proceedings of the 36th International Conference on Machine Learning},
  pages = 	 {7324--7334},
  year = 	 {2019},
  editor = 	 {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume = 	 {97},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {09--15 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v97/zhang19a/zhang19a.pdf},
  url = 	 {https://proceedings.mlr.press/v97/zhang19a.html},
  abstract = 	 {Modern convolutional networks are not shift-invariant, as small input shifts or translations can cause drastic changes in the output. Commonly used downsampling methods, such as max-pooling, strided-convolution, and average-pooling, ignore the sampling theorem. The well-known signal processing fix is anti-aliasing by low-pass filtering before downsampling. However, simply inserting this module into deep networks leads to performance degradation; as a result, it is seldomly used today. We show that when integrated correctly, it is compatible with existing architectural components, such as max-pooling. The technique is general and can be incorporated across layer types and applications, such as image classification and conditional image generation. In addition to increased shift-invariance, we also observe, surprisingly, that anti-aliasing boosts accuracy in ImageNet classification, across several commonly-used architectures. This indicates that anti-aliasing serves as effective regularization. Our results demonstrate that this classical signal processing technique has been undeservingly overlooked in modern deep networks.}
}

Endnote

%0 Conference Paper
%T Making Convolutional Networks Shift-Invariant Again
%A Richard Zhang
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov	
%F pmlr-v97-zhang19a
%I PMLR
%P 7324--7334
%U https://proceedings.mlr.press/v97/zhang19a.html
%V 97
%X Modern convolutional networks are not shift-invariant, as small input shifts or translations can cause drastic changes in the output. Commonly used downsampling methods, such as max-pooling, strided-convolution, and average-pooling, ignore the sampling theorem. The well-known signal processing fix is anti-aliasing by low-pass filtering before downsampling. However, simply inserting this module into deep networks leads to performance degradation; as a result, it is seldomly used today. We show that when integrated correctly, it is compatible with existing architectural components, such as max-pooling. The technique is general and can be incorporated across layer types and applications, such as image classification and conditional image generation. In addition to increased shift-invariance, we also observe, surprisingly, that anti-aliasing boosts accuracy in ImageNet classification, across several commonly-used architectures. This indicates that anti-aliasing serves as effective regularization. Our results demonstrate that this classical signal processing technique has been undeservingly overlooked in modern deep networks.

APA


Zhang, R.. (2019). Making Convolutional Networks Shift-Invariant Again. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:7324-7334 Available from https://proceedings.mlr.press/v97/zhang19a.html.

Making Convolutional Networks Shift-Invariant Again

Abstract

Cite this Paper

Related Material