Bank of Weight Filters for Deep CNNs

Suresh Kirthi Kumaraswamy; PS Sastry; Kalpathi Ramakrishnan

Proceedings of The 8th Asian Conference on Machine Learning, PMLR 63:334-349, 2016.

Abstract

Convolutional neural networks (CNNs) are seen to be extremely effective in many large object recognition tasks. One reason is that they also learn appropriate features from the training data. The convolutional layers of a CNN contain these feature-generating filters, whose weights are learnt. However, this entails learning millions of weights across different layers, so learning times are very large even on the best available hardware. Some studies in transfer learning have observed that a network learnt on one task can be reused on another task with some fine-tuning. In this context, this paper presents a systematic study of the exchangeability of weight filters of CNNs across different object recognition tasks. The paper proposes the concept of a bank of weight filters (BWF), which consists of all the weight vectors of filters learnt by different CNNs on different tasks. The BWF can be viewed at multiple levels of granularity: network level, layer level, and filter level. Through extensive empirical investigations we show that one can efficiently learn CNNs for new tasks by randomly selecting from the bank of filters to initialize the convolutional layers of the new CNN. Our study covers all of the levels of granularity mentioned above. Our results show that the BWF concept offers a very good strategy for initializing the filters when learning CNNs. We also show that the dependency among the filters and the layers of the CNN is not strict: one can choose any pre-trained filter for initialization, instead of a fixed pre-trained net as a whole. This paper is a first step towards creating and characterizing a Universal BWF for efficient learning of CNNs.
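
The filter-level idea in the abstract, drawing filters at random from a bank to initialize a new convolutional layer, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names `build_filter_bank` and `init_layer_from_bank`, the grouping of filters by shape, the `(num_filters, channels, height, width)` layout, and the with/without-replacement rule are all assumptions made for illustration, and the "pretrained" weights are random stand-ins.

```python
import numpy as np


def build_filter_bank(pretrained_layers):
    """Pool filters from several conv layers, grouped by filter shape.

    pretrained_layers: iterable of arrays shaped
    (num_filters, channels, height, width) -- an assumed layout.
    Returns a dict mapping (channels, height, width) to one array
    holding every filter of that shape.
    """
    groups = {}
    for weights in pretrained_layers:
        groups.setdefault(weights.shape[1:], []).append(weights)
    return {shape: np.concatenate(chunks, axis=0) for shape, chunks in groups.items()}


def init_layer_from_bank(bank, num_filters, filter_shape, rng):
    """Initialize one conv layer by drawing filters at random from the bank."""
    candidates = bank[tuple(filter_shape)]
    # Assumption: sample without replacement when the bank is large enough,
    # otherwise fall back to sampling with replacement.
    replace = len(candidates) < num_filters
    idx = rng.choice(len(candidates), size=num_filters, replace=replace)
    return candidates[idx].copy()


rng = np.random.default_rng(0)
# Random stand-ins for first-layer weights learnt on two different tasks.
net_a_conv1 = rng.normal(size=(32, 3, 5, 5))
net_b_conv1 = rng.normal(size=(64, 3, 5, 5))

bank = build_filter_bank([net_a_conv1, net_b_conv1])
new_conv1 = init_layer_from_bank(bank, num_filters=48, filter_shape=(3, 5, 5), rng=rng)
print(new_conv1.shape)  # (48, 3, 5, 5)
```

The resulting array would then be copied into the new network's convolutional layer before training begins. Network-level and layer-level selection would instead pick a whole pretrained net or a whole layer from the bank rather than individual filters.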

Cite this Paper

BibTeX


@InProceedings{pmlr-v63-kumaraswamy29,
  title = 	 {Bank of Weight Filters for Deep CNNs},
  author = 	 {Kumaraswamy, Suresh Kirthi and Sastry, PS and Ramakrishnan, Kalpathi},
  booktitle = 	 {Proceedings of The 8th Asian Conference on Machine Learning},
  pages = 	 {334--349},
  year = 	 {2016},
  editor = 	 {Durrant, Robert J. and Kim, Kee-Eung},
  volume = 	 {63},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {The University of Waikato, Hamilton, New Zealand},
  month = 	 {16--18 Nov},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v63/kumaraswamy29.pdf},
  url = 	 {https://proceedings.mlr.press/v63/kumaraswamy29.html},
  abstract = 	 {Convolutional neural networks (CNNs) are seen to be extremely effective in many large object recognition tasks. One of the reasons for this is that they learn appropriate features also from the training data. The convolutional layers of a CNN have these feature generating filters whose weights are learnt. However, this entails learning millions of weights (across different layers) and hence learning times are very large even on the best available hardware. In some studies in transfer learning it has been observed that the network learnt on one task can be reused on another task (by some finetuning). In this context, this paper presents a systematic study of the exchangeability of weight filters of CNNs across different object recognition tasks. The paper proposes the concept of bank of weight-filters (BWF) which consists of all the weight vectors of filters learnt by different CNNs on different tasks. The BWF can be viewed at multiple levels of granularity such as network-level, layer-level and filter-level. Through extensive empirical investigations we show that one can efficiently learn CNNs for new tasks by randomly selecting from the bank of filters for initializing the convolutional layers of the new CNN. Our study is done at all the multiple levels of granularity mentioned above. Our results show that the concept of BWF proposed here would offer a very good strategy for initializing the filters while learning CNNs. We also show that the dependency among the filters and the layers of the CNN is not strict. One can choose any pre-trained filter instead of a fixed pre-trained net, as a whole, for initialization. This paper is a first step in the direction of creating and characterizing a Universal BWF for efficient learning of CNNs.}
}

Endnote

%0 Conference Paper
%T Bank of Weight Filters for Deep CNNs
%A Suresh Kirthi Kumaraswamy
%A PS Sastry
%A Kalpathi Ramakrishnan
%B Proceedings of The 8th Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Robert J. Durrant
%E Kee-Eung Kim
%F pmlr-v63-kumaraswamy29
%I PMLR
%P 334--349
%U https://proceedings.mlr.press/v63/kumaraswamy29.html
%V 63
%X Convolutional neural networks (CNNs) are seen to be extremely effective in many large object recognition tasks. One of the reasons for this is that they learn appropriate features also from the training data. The convolutional layers of a CNN have these feature generating filters whose weights are learnt. However, this entails learning millions of weights (across different layers) and hence learning times are very large even on the best available hardware. In some studies in transfer learning it has been observed that the network learnt on one task can be reused on another task (by some finetuning). In this context, this paper presents a systematic study of the exchangeability of weight filters of CNNs across different object recognition tasks. The paper proposes the concept of bank of weight-filters (BWF) which consists of all the weight vectors of filters learnt by different CNNs on different tasks. The BWF can be viewed at multiple levels of granularity such as network-level, layer-level and filter-level. Through extensive empirical investigations we show that one can efficiently learn CNNs for new tasks by randomly selecting from the bank of filters for initializing the convolutional layers of the new CNN. Our study is done at all the multiple levels of granularity mentioned above. Our results show that the concept of BWF proposed here would offer a very good strategy for initializing the filters while learning CNNs. We also show that the dependency among the filters and the layers of the CNN is not strict. One can choose any pre-trained filter instead of a fixed pre-trained net, as a whole, for initialization. This paper is a first step in the direction of creating and characterizing a Universal BWF for efficient learning of CNNs.

RIS


TY  - CPAPER
TI  - Bank of Weight Filters for Deep CNNs
AU  - Suresh Kirthi Kumaraswamy
AU  - PS Sastry
AU  - Kalpathi Ramakrishnan
BT  - Proceedings of The 8th Asian Conference on Machine Learning
DA  - 2016/11/20
ED  - Robert J. Durrant
ED  - Kee-Eung Kim
ID  - pmlr-v63-kumaraswamy29
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 63
SP  - 334
EP  - 349
L1  - http://proceedings.mlr.press/v63/kumaraswamy29.pdf
UR  - https://proceedings.mlr.press/v63/kumaraswamy29.html
AB  - Convolutional neural networks (CNNs) are seen to be extremely effective in many large object recognition tasks. One of the reasons for this is that they learn appropriate features also from the training data. The convolutional layers of a CNN have these feature generating filters whose weights are learnt. However, this entails learning millions of weights (across different layers) and hence learning times are very large even on the best available hardware. In some studies in transfer learning it has been observed that the network learnt on one task can be reused on another task (by some finetuning). In this context, this paper presents a systematic study of the exchangeability of weight filters of CNNs across different object recognition tasks. The paper proposes the concept of bank of weight-filters (BWF) which consists of all the weight vectors of filters learnt by different CNNs on different tasks. The BWF can be viewed at multiple levels of granularity such as network-level, layer-level and filter-level. Through extensive empirical investigations we show that one can efficiently learn CNNs for new tasks by randomly selecting from the bank of filters for initializing the convolutional layers of the new CNN. Our study is done at all the multiple levels of granularity mentioned above. Our results show that the concept of BWF proposed here would offer a very good strategy for initializing the filters while learning CNNs. We also show that the dependency among the filters and the layers of the CNN is not strict. One can choose any pre-trained filter instead of a fixed pre-trained net, as a whole, for initialization. This paper is a first step in the direction of creating and characterizing a Universal BWF for efficient learning of CNNs.
ER  -

APA


Kumaraswamy, S. K., Sastry, P. S., & Ramakrishnan, K. (2016). Bank of Weight Filters for Deep CNNs. Proceedings of The 8th Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 63:334-349. Available from https://proceedings.mlr.press/v63/kumaraswamy29.html.
