Network Morphism

Tao Wei; Changhu Wang; Yong Rui; Chang Wen Chen

Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:564-572, 2016.

Abstract

We present a systematic study on how to morph a well-trained neural network to a new one so that its network function can be completely preserved. We define this as network morphism in this research. After morphing a parent network, the child network is expected to inherit the knowledge from its parent network and also has the potential to continue growing into a more powerful one with much shortened training time. The first requirement for this network morphism is its ability to handle diverse morphing types of networks, including changes of depth, width, kernel size, and even subnet. To meet this requirement, we first introduce the network morphism equations, and then develop novel morphing algorithms for all these morphing types for both classic and convolutional neural networks. The second requirement is its ability to deal with non-linearity in a network. We propose a family of parametric-activation functions to facilitate the morphing of any continuous non-linear activation neurons. Experimental results on benchmark datasets and typical neural networks demonstrate the effectiveness of the proposed network morphism scheme.
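To make the function-preserving idea concrete, here is a minimal sketch of one morphing type the abstract mentions, a change of width. It is not the paper's algorithm: it assumes a toy two-layer fully connected network with a ReLU activation, and the helper names (`forward`, `widen_hidden`) are hypothetical. The child network gets extra hidden units whose incoming weights are random and whose outgoing weights are zero, so its output matches the parent's before any further training.

```python
import random

def relu(v):
    # Element-wise ReLU on a list of floats.
    return [max(a, 0.0) for a in v]

def matvec(m, v):
    # Matrix-vector product for a list-of-rows matrix.
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def forward(w1, w2, x):
    # Toy two-layer network: x -> relu(w1 x) -> w2 h.
    return matvec(w2, relu(matvec(w1, x)))

def widen_hidden(w1, w2, extra, rng):
    # Hypothetical helper: add `extra` hidden units.
    # New rows of w1 are random; the matching new columns of w2 are zero,
    # so the new units contribute nothing to the output until trained.
    n_in = len(w1[0])
    new_rows = [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(extra)]
    child_w1 = [row[:] for row in w1] + new_rows
    child_w2 = [row[:] + [0.0] * extra for row in w2]
    return child_w1, child_w2

rng = random.Random(0)
w1 = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(4)]  # 3 inputs -> 4 hidden
w2 = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(2)]  # 4 hidden -> 2 outputs
x = [rng.gauss(0.0, 1.0) for _ in range(3)]

child_w1, child_w2 = widen_hidden(w1, w2, extra=3, rng=rng)
parent_out = forward(w1, w2, x)
child_out = forward(child_w1, child_w2, x)

# Largest difference between parent and child outputs on this input.
max_diff = max(abs(p - c) for p, c in zip(parent_out, child_out))
print(max_diff)
```

Zeroing the outgoing weights is only the simplest way to keep the function unchanged; the paper's morphing algorithms for depth, kernel size, and subnet changes, and its parametric-activation functions, are not reproduced here.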

Cite this Paper

BibTeX


@InProceedings{pmlr-v48-wei16,
  title = 	 {Network Morphism},
  author = 	 {Wei, Tao and Wang, Changhu and Rui, Yong and Chen, Chang Wen},
  booktitle = 	 {Proceedings of The 33rd International Conference on Machine Learning},
  pages = 	 {564--572},
  year = 	 {2016},
  editor = 	 {Balcan, Maria Florina and Weinberger, Kilian Q.},
  volume = 	 {48},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {New York, New York, USA},
  month = 	 {20--22 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v48/wei16.pdf},
  url = 	 {https://proceedings.mlr.press/v48/wei16.html},
  abstract = 	 {We present a systematic study on how to morph a well-trained neural network to a new one so that its network function can be completely preserved. We define this as network morphism in this research. After morphing a parent network, the child network is expected to inherit the knowledge from its parent network and also has the potential to continue growing into a more powerful one with much shortened training time. The first requirement for this network morphism is its ability to handle diverse morphing types of networks, including changes of depth, width, kernel size, and even subnet. To meet this requirement, we first introduce the network morphism equations, and then develop novel morphing algorithms for all these morphing types for both classic and convolutional neural networks. The second requirement is its ability to deal with non-linearity in a network. We propose a family of parametric-activation functions to facilitate the morphing of any continuous non-linear activation neurons. Experimental results on benchmark datasets and typical neural networks demonstrate the effectiveness of the proposed network morphism scheme.}
}

Endnote

%0 Conference Paper
%T Network Morphism
%A Tao Wei
%A Changhu Wang
%A Yong Rui
%A Chang Wen Chen
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger
%F pmlr-v48-wei16
%I PMLR
%P 564--572
%U https://proceedings.mlr.press/v48/wei16.html
%V 48
%X We present a systematic study on how to morph a well-trained neural network to a new one so that its network function can be completely preserved. We define this as network morphism in this research. After morphing a parent network, the child network is expected to inherit the knowledge from its parent network and also has the potential to continue growing into a more powerful one with much shortened training time. The first requirement for this network morphism is its ability to handle diverse morphing types of networks, including changes of depth, width, kernel size, and even subnet. To meet this requirement, we first introduce the network morphism equations, and then develop novel morphing algorithms for all these morphing types for both classic and convolutional neural networks. The second requirement is its ability to deal with non-linearity in a network. We propose a family of parametric-activation functions to facilitate the morphing of any continuous non-linear activation neurons. Experimental results on benchmark datasets and typical neural networks demonstrate the effectiveness of the proposed network morphism scheme.

RIS


TY  - CPAPER
TI  - Network Morphism
AU  - Tao Wei
AU  - Changhu Wang
AU  - Yong Rui
AU  - Chang Wen Chen
BT  - Proceedings of The 33rd International Conference on Machine Learning
DA  - 2016/06/11
ED  - Maria Florina Balcan
ED  - Kilian Q. Weinberger
ID  - pmlr-v48-wei16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 48
SP  - 564
EP  - 572
L1  - http://proceedings.mlr.press/v48/wei16.pdf
UR  - https://proceedings.mlr.press/v48/wei16.html
AB  - We present a systematic study on how to morph a well-trained neural network to a new one so that its network function can be completely preserved. We define this as network morphism in this research. After morphing a parent network, the child network is expected to inherit the knowledge from its parent network and also has the potential to continue growing into a more powerful one with much shortened training time. The first requirement for this network morphism is its ability to handle diverse morphing types of networks, including changes of depth, width, kernel size, and even subnet. To meet this requirement, we first introduce the network morphism equations, and then develop novel morphing algorithms for all these morphing types for both classic and convolutional neural networks. The second requirement is its ability to deal with non-linearity in a network. We propose a family of parametric-activation functions to facilitate the morphing of any continuous non-linear activation neurons. Experimental results on benchmark datasets and typical neural networks demonstrate the effectiveness of the proposed network morphism scheme.
ER  -

APA


Wei, T., Wang, C., Rui, Y. & Chen, C.W. (2016). Network Morphism. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:564-572. Available from https://proceedings.mlr.press/v48/wei16.html.
