Associative Convolutional Layers

Hamed Omidvar; Vahideh Akhlaghi; Hao Su; Massimo Franceschetti; Rajesh Gupta

Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:3115-3123, 2021.

Abstract

We provide a general and easy-to-implement method for reducing the number of parameters of Convolutional Neural Networks (CNNs) during the training and inference phases. We introduce a simple trainable auxiliary neural network which can generate approximate versions of “slices” of the sets of convolutional filters of any CNN architecture from a low-dimensional “code” space. These slices are then concatenated to form the sets of filters in the CNN architecture. The auxiliary neural network, which we call “Convolutional Slice Generator” (CSG), is unique to the network and provides the association among its convolutional layers. We apply our method to various CNN architectures including ResNet, DenseNet, MobileNet and ShuffleNet. Experiments on CIFAR-10 and ImageNet-1000, without any hyper-parameter tuning, show that our approach reduces the network parameters by approximately $2\times$ while the reduction in accuracy is confined to within one percent, and sometimes the accuracy even improves after compression. Interestingly, our experiments show that acceptable performance is still achieved even when the CSG weights are random binary values that are not learned. To show that our approach generalizes to other tasks, we apply it to an image segmentation architecture, Deeplab V3, on the Pascal VOC 2012 dataset. Results show that, without any parameter tuning, there is an $\approx 2.3\times$ parameter reduction and the mean Intersection over Union (mIoU) drops by $\approx 3\%$. Finally, we provide comparisons with several related methods showing the superiority of our method in terms of accuracy.
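The slice-generation idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the CSG is a single linear map (the abstract only says it is a simple auxiliary network), it flattens each layer's filters into one list, and the names `make_csg`, `generate_slice`, `build_filters`, and `param_counts`, along with all sizes, are hypothetical. It shows one shared generator turning per-slice codes into filter weights, and how stored parameters could be counted under these assumptions.

```python
import random


def make_csg(code_dim, slice_size, seed=0):
    """Hypothetical shared generator: a slice_size x code_dim weight matrix.

    Assumption: the CSG is modelled as a single linear map.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.1) for _ in range(code_dim)]
            for _ in range(slice_size)]


def generate_slice(csg, code):
    """Map one low-dimensional code to one flattened slice of filter weights."""
    return [sum(w * c for w, c in zip(row, code)) for row in csg]


def build_filters(csg, codes):
    """Concatenate the slices generated from each code into one layer's weights."""
    weights = []
    for code in codes:
        weights.extend(generate_slice(csg, code))
    return weights


def param_counts(layer_sizes, code_dim, slice_size):
    """Return (plain, shared) stored-parameter counts for the toy setup.

    plain:  every filter weight is stored directly.
    shared: one generator matrix plus one code per slice.
    """
    plain = sum(layer_sizes)
    n_slices = sum(size // slice_size for size in layer_sizes)
    shared = code_dim * slice_size + n_slices * code_dim
    return plain, shared


# Illustrative sizes only; they are not taken from the paper.
CODE_DIM, SLICE_SIZE = 4, 16
layer_sizes = [64, 128]  # flattened filter weights per layer
csg = make_csg(CODE_DIM, SLICE_SIZE)

# One code per slice; in training these would be learned parameters.
codes_per_layer = [
    [[0.5] * CODE_DIM for _ in range(size // SLICE_SIZE)]
    for size in layer_sizes
]
filters_per_layer = [build_filters(csg, codes) for codes in codes_per_layer]
plain, shared = param_counts(layer_sizes, CODE_DIM, SLICE_SIZE)
print(plain, shared)
```

Because the same generator serves every layer, its cost is paid once, so the saving in this toy count grows with the number of slices. The ratios reported in the abstract come from the authors' experiments, not from this sketch.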

Cite this Paper

BibTeX


@InProceedings{pmlr-v130-omidvar21a,
  title = 	 {Associative Convolutional Layers},
  author =       {Omidvar, Hamed and Akhlaghi, Vahideh and Su, Hao and Franceschetti, Massimo and Gupta, Rajesh},
  booktitle = 	 {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {3115--3123},
  year = 	 {2021},
  editor = 	 {Banerjee, Arindam and Fukumizu, Kenji},
  volume = 	 {130},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--15 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v130/omidvar21a/omidvar21a.pdf},
  url = 	 {https://proceedings.mlr.press/v130/omidvar21a.html},
  abstract = 	 {We provide a general and easy to implement method for reducing the number of parameters of Convolutional Neural Networks (CNNs) during the training and inference phases. We introduce a simple trainable auxiliary neural network which can generate approximate versions of “slices” of the sets of convolutional filters of any CNN architecture from a low dimensional “code” space. These slices are then concatenated to form the sets of filters in the CNN architecture. The auxiliary neural network, which we call “Convolutional Slice Generator” (CSG), is unique to the network and provides the association among its convolutional layers. We apply our method to various CNN architectures including ResNet, DenseNet, MobileNet and ShuffleNet. Experiments on CIFAR-10 and ImageNet-1000, without any hyper-parameter tuning, show that our approach reduces the network parameters by approximately $2\times$ while the reduction in accuracy is confined to within one percent and sometimes the accuracy even improves after compression. Interestingly, through our experiments, we show that even when the CSG takes random binary values for its weights that are not learned, still acceptable performances are achieved. To show that our approach generalizes to other tasks, we apply it to an image segmentation architecture, Deeplab V3, on the Pascal VOC 2012 dataset. Results show that without any parameter tuning, there is $\approx 2.3\times$ parameter reduction and the mean Intersection over Union (mIoU) drops by $\approx 3\%$. Finally, we provide comparisons with several related methods showing the superiority of our method in terms of accuracy.}
}

Endnote

%0 Conference Paper
%T Associative Convolutional Layers
%A Hamed Omidvar
%A Vahideh Akhlaghi
%A Hao Su
%A Massimo Franceschetti
%A Rajesh Gupta
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu
%F pmlr-v130-omidvar21a
%I PMLR
%P 3115--3123
%U https://proceedings.mlr.press/v130/omidvar21a.html
%V 130
%X  We provide a general and easy to implement method for reducing the number of parameters of Convolutional Neural Networks (CNNs) during the training and inference phases. We introduce a simple trainable auxiliary neural network which can generate approximate versions of “slices” of the sets of convolutional filters of any CNN architecture from a low dimensional “code” space. These slices are then concatenated to form the sets of filters in the CNN architecture. The auxiliary neural network, which we call “Convolutional Slice Generator” (CSG), is unique to the network and provides the association among its convolutional layers. We apply our method to various CNN architectures including ResNet, DenseNet, MobileNet and ShuffleNet. Experiments on CIFAR-10 and ImageNet-1000, without any hyper-parameter tuning, show that our approach reduces the network parameters by approximately $2\times$ while the reduction in accuracy is confined to within one percent and sometimes the accuracy even improves after compression. Interestingly, through our experiments, we show that even when the CSG takes random binary values for its weights that are not learned, still acceptable performances are achieved. To show that our approach generalizes to other tasks, we apply it to an image segmentation architecture, Deeplab V3, on the Pascal VOC 2012 dataset. Results show that without any parameter tuning, there is $\approx 2.3\times$ parameter reduction and the mean Intersection over Union (mIoU) drops by $\approx 3%$. Finally, we provide comparisons with several related methods showing the superiority of our method in terms of accuracy.

APA


Omidvar, H., Akhlaghi, V., Su, H., Franceschetti, M., & Gupta, R. (2021). Associative Convolutional Layers. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:3115-3123. Available from https://proceedings.mlr.press/v130/omidvar21a.html.
