Adaptive wavelet pooling for convolutional neural networks
Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:1936-1944, 2021.
Convolutional neural networks (CNNs) have become the go-to choice for most image and video processing tasks. Most CNN architectures rely on pooling layers to reduce the resolution along spatial dimensions. The reduction allows subsequent deep convolution layers to operate with greater efficiency. This paper introduces adaptive wavelet pooling layers, which employ fast wavelet transforms (FWT) to reduce the feature resolution. The FWT decomposes the input features into multiple scales, and the feature dimensions are reduced by removing the fine-scale subbands. Our approach adds extra flexibility through wavelet-basis function optimization and coefficient weighting at different scales. The adaptive wavelet layers integrate directly into well-known CNNs like the LeNet, AlexNet, or DenseNet architectures. Using these networks, we validate our approach and find competitive performance on the MNIST, CIFAR10, and SVHN (street view house numbers) datasets.
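To illustrate the pooling idea described in the abstract, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a fixed Haar wavelet in place of the learned wavelet basis, and two hand-set scalar weights in place of the learned coefficient weighting. The function names `haar_fwt2`, `haar_ifwt2`, and `wavelet_pool` are hypothetical. The sketch decomposes the input over two scales, discards the finest detail subbands, weights the remaining coefficients, and inverts one level to obtain features at half the input resolution.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def haar_fwt2(x):
    """One level of the 2D orthonormal Haar transform (even height and width)."""
    # Filter along the width: pairwise sums (low-pass) and differences (high-pass).
    lo = (x[..., :, 0::2] + x[..., :, 1::2]) / SQRT2
    hi = (x[..., :, 0::2] - x[..., :, 1::2]) / SQRT2
    # Filter along the height to obtain the four subbands.
    ll = (lo[..., 0::2, :] + lo[..., 1::2, :]) / SQRT2  # approximation
    lh = (lo[..., 0::2, :] - lo[..., 1::2, :]) / SQRT2  # detail
    hl = (hi[..., 0::2, :] + hi[..., 1::2, :]) / SQRT2  # detail
    hh = (hi[..., 0::2, :] - hi[..., 1::2, :]) / SQRT2  # detail
    return ll, (lh, hl, hh)

def haar_ifwt2(ll, details):
    """Inverse of haar_fwt2: doubles the height and width."""
    lh, hl, hh = details
    h, w = ll.shape[-2:]
    lo = np.empty(ll.shape[:-2] + (2 * h, w))
    hi = np.empty_like(lo)
    # Undo the filtering along the height.
    lo[..., 0::2, :] = (ll + lh) / SQRT2
    lo[..., 1::2, :] = (ll - lh) / SQRT2
    hi[..., 0::2, :] = (hl + hh) / SQRT2
    hi[..., 1::2, :] = (hl - hh) / SQRT2
    # Undo the filtering along the width.
    x = np.empty(ll.shape[:-2] + (2 * h, 2 * w))
    x[..., :, 0::2] = (lo + hi) / SQRT2
    x[..., :, 1::2] = (lo - hi) / SQRT2
    return x

def wavelet_pool(x, weights=(1.0, 1.0)):
    """Halve the spatial resolution of x (height and width divisible by 4).

    weights = (approximation weight, detail weight) at the coarse scale.
    These are fixed scalars here (an assumption); the paper describes
    coefficient weighting that is optimized during training.
    """
    ll1, _ = haar_fwt2(x)             # drop the fine-scale detail subbands
    ll2, details2 = haar_fwt2(ll1)    # second, coarser scale
    w_approx, w_detail = weights
    weighted = tuple(w_detail * d for d in details2)
    return haar_ifwt2(w_approx * ll2, weighted)

# Example: pool a batch of feature maps shaped (batch, channels, height, width).
features = np.random.default_rng(0).normal(size=(2, 3, 8, 8))
pooled = wavelet_pool(features, weights=(1.0, 0.5))
print(pooled.shape)  # (2, 3, 4, 4)
```

With both weights set to 1, this sketch reduces to a scaled 2x2 average pooling, since the Haar approximation subband is a scaled block mean. Changing the weights or the wavelet alters how much coarse-scale detail survives the pooling step.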