SplitNet: Learning to Semantically Split Deep Networks for Parameter Reduction and Model Parallelization

Juyong Kim, Yookoon Park, Gunhee Kim, Sung Ju Hwang
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:1866-1874, 2017.

Abstract

We propose a novel deep neural network that is both lightweight and effectively structured for model parallelization. Our network, which we name SplitNet, automatically learns to split the network weights into either a set or a hierarchy of multiple groups that use disjoint sets of features, by learning both the class-to-group and feature-to-group assignment matrices along with the network weights. This produces a tree-structured network with no connections between branched subtrees of semantically disparate class groups. SplitNet thus greatly reduces the number of parameters and requires significantly less computation, and it is also embarrassingly model parallelizable at test time, since the evaluation of each subnetwork is completely independent except for the shared lower-layer weights, which can be duplicated over multiple processors. We validate our method with two deep network models (ResNet and AlexNet) on two datasets (CIFAR-100 and ILSVRC 2012) for image classification. Our method obtains networks with a significantly reduced number of parameters while achieving classification accuracy comparable or superior to the original full deep networks, along with accelerated test speed on multiple GPUs.
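
To make the grouping mechanism concrete, here is a minimal NumPy sketch (our own illustration, not code from the paper) of how class-to-group and feature-to-group assignment matrices induce a block-structured weight matrix whose groups can be evaluated independently. In SplitNet these assignments are learned jointly with the network weights; here they are fixed one-hot matrices purely to show the resulting structure.

```python
import numpy as np

# Hypothetical illustration (not the authors' code): a fully-connected layer
# W of shape (num_classes, num_features) is masked by the outer product of a
# class-to-group assignment P (num_classes x G) and a feature-to-group
# assignment Q (num_features x G), so each class only uses features assigned
# to its own group.

num_classes, num_features, G = 6, 8, 2

rng = np.random.default_rng(0)
W = rng.standard_normal((num_classes, num_features))

# One-hot group assignments (rows sum to 1): classes 0-2 and features 0-3
# in group 0, the rest in group 1.
P = np.zeros((num_classes, G)); P[:3, 0] = 1; P[3:, 1] = 1
Q = np.zeros((num_features, G)); Q[:4, 0] = 1; Q[4:, 1] = 1

# M[i, j] = 1 iff class i and feature j share a group; W_split is therefore
# block structured, so each group's classes can be evaluated using only that
# group's features (the basis of the model-parallelization claim).
M = P @ Q.T
W_split = W * M

x = rng.standard_normal(num_features)
full_logits = W_split @ x

# Evaluate each group separately and check it matches the block-masked layer.
for g in range(G):
    cls_idx = np.where(P[:, g] == 1)[0]
    feat_idx = np.where(Q[:, g] == 1)[0]
    group_logits = W_split[np.ix_(cls_idx, feat_idx)] @ x[feat_idx]
    assert np.allclose(group_logits, full_logits[cls_idx])
print("per-group evaluation matches the block-masked full layer")
```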

Cite this Paper


BibTeX
@InProceedings{pmlr-v70-kim17b,
  title     = {{S}plit{N}et: Learning to Semantically Split Deep Networks for Parameter Reduction and Model Parallelization},
  author    = {Juyong Kim and Yookoon Park and Gunhee Kim and Sung Ju Hwang},
  booktitle = {Proceedings of the 34th International Conference on Machine Learning},
  pages     = {1866--1874},
  year      = {2017},
  editor    = {Precup, Doina and Teh, Yee Whye},
  volume    = {70},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--11 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v70/kim17b/kim17b.pdf},
  url       = {https://proceedings.mlr.press/v70/kim17b.html},
  abstract  = {We propose a novel deep neural network that is both lightweight and effectively structured for model parallelization. Our network, which we name as SplitNet, automatically learns to split the network weights into either a set or a hierarchy of multiple groups that use disjoint sets of features, by learning both the class-to-group and feature-to-group assignment matrices along with the network weights. This produces a tree-structured network that involves no connection between branched subtrees of semantically disparate class groups. SplitNet thus greatly reduces the number of parameters and requires significantly less computations, and is also embarrassingly model parallelizable at test time, since the network evaluation for each subnetwork is completely independent except for the shared lower layer weights that can be duplicated over multiple processors. We validate our method with two deep network models (ResNet and AlexNet) on two different datasets (CIFAR-100 and ILSVRC 2012) for image classification, on which our method obtains networks with significantly reduced number of parameters while achieving comparable or superior classification accuracies over original full deep networks, and accelerated test speed with multiple GPUs.}
}
Endnote
%0 Conference Paper
%T SplitNet: Learning to Semantically Split Deep Networks for Parameter Reduction and Model Parallelization
%A Juyong Kim
%A Yookoon Park
%A Gunhee Kim
%A Sung Ju Hwang
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-kim17b
%I PMLR
%P 1866--1874
%U https://proceedings.mlr.press/v70/kim17b.html
%V 70
%X We propose a novel deep neural network that is both lightweight and effectively structured for model parallelization. Our network, which we name as SplitNet, automatically learns to split the network weights into either a set or a hierarchy of multiple groups that use disjoint sets of features, by learning both the class-to-group and feature-to-group assignment matrices along with the network weights. This produces a tree-structured network that involves no connection between branched subtrees of semantically disparate class groups. SplitNet thus greatly reduces the number of parameters and requires significantly less computations, and is also embarrassingly model parallelizable at test time, since the network evaluation for each subnetwork is completely independent except for the shared lower layer weights that can be duplicated over multiple processors. We validate our method with two deep network models (ResNet and AlexNet) on two different datasets (CIFAR-100 and ILSVRC 2012) for image classification, on which our method obtains networks with significantly reduced number of parameters while achieving comparable or superior classification accuracies over original full deep networks, and accelerated test speed with multiple GPUs.
APA
Kim, J., Park, Y., Kim, G. & Hwang, S.J. (2017). SplitNet: Learning to Semantically Split Deep Networks for Parameter Reduction and Model Parallelization. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:1866-1874. Available from https://proceedings.mlr.press/v70/kim17b.html.
