Dynamic Capacity Networks

Amjad Almahairi, Nicolas Ballas, Tim Cooijmans, Yin Zheng, Hugo Larochelle, Aaron Courville
Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:2549-2558, 2016.

Abstract

We introduce the Dynamic Capacity Network (DCN), a neural network that can adaptively assign its capacity across different portions of the input data. This is achieved by combining modules of two types: low-capacity sub-networks and high-capacity sub-networks. The low-capacity sub-networks are applied across most of the input, but also provide a guide to select a few portions of the input on which to apply the high-capacity sub-networks. The selection is made using a novel gradient-based attention mechanism that efficiently identifies input regions for which the DCN’s output is most sensitive and to which we should devote more capacity. We focus our empirical evaluation on the Cluttered MNIST and SVHN image datasets. Our findings indicate that DCNs are able to drastically reduce the number of computations, compared to traditional convolutional neural networks, while maintaining similar or even better performance.
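To make the mechanism concrete, below is a minimal, illustrative PyTorch sketch of the idea the abstract describes: a low-capacity network runs over the whole image, the gradient of an output-sensitivity measure (here, the entropy of the coarse prediction, used as an illustrative choice) with respect to the coarse feature map serves as a saliency map, and a high-capacity network is applied only to the k most salient input patches, whose features replace the coarse ones at those positions. All names (DCNSketch, coarse, fine, top, k, patch) are hypothetical; this is a sketch of the technique under stated assumptions, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DCNSketch(nn.Module):
    """Illustrative DCN: coarse features everywhere, fine features on top-k patches."""

    def __init__(self, num_classes=10, k=4, patch=8):
        super().__init__()
        self.k, self.patch = k, patch
        # Low-capacity sub-network: cheap, applied across the whole input.
        self.coarse = nn.Sequential(
            nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
        )
        # High-capacity sub-network: expensive, applied only to selected patches.
        # It pools each patch to one feature vector with the same channel count
        # (16) so it can replace a coarse feature vector in place.
        self.fine = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.top = nn.Linear(16, num_classes)

    def forward(self, x):
        # 1) Coarse pass over the full image.
        feats = self.coarse(x)                          # (B, C, H, W)
        logits = self.top(feats.mean(dim=(2, 3)))
        probs = F.softmax(logits, dim=1)
        entropy = -(probs * (probs + 1e-8).log()).sum()

        # 2) Gradient-based attention: positions where the output is most
        #    sensitive carry the largest gradient magnitude. (Gradients must be
        #    enabled for this step, even at test time.)
        grads, = torch.autograd.grad(entropy, feats, retain_graph=True)
        saliency = grads.norm(dim=1)                    # (B, H, W)

        # 3) Refine the k most salient positions with the high-capacity net.
        B, C, H, W = feats.shape
        stride = x.size(2) // H                         # coarse downsampling factor
        topk = saliency.view(B, -1).topk(self.k, dim=1).indices
        refined = feats.clone()
        for b in range(B):
            for idx in topk[b].tolist():
                i, j = divmod(idx, W)
                pi, pj = i * stride, j * stride         # patch top-left in input space
                # Slices truncate at the image border; AdaptiveAvgPool2d tolerates that.
                patch = x[b:b + 1, :, pi:pi + self.patch, pj:pj + self.patch]
                refined[b, :, i, j] = self.fine(patch).view(-1)

        # 4) Final prediction from the mixed coarse/fine feature map.
        return self.top(refined.mean(dim=(2, 3)))

model = DCNSketch()
x = torch.randn(2, 1, 64, 64)   # e.g. grayscale images of a Cluttered-MNIST-like size
y = model(x)                    # (2, 10); do not wrap in torch.no_grad()

The per-example Python loop is for clarity only and would be batched in practice; the key point, as in the paper, is that the expensive sub-network touches only a few small patches while the cheap one covers the rest of the input.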

Cite this Paper


BibTeX
@InProceedings{pmlr-v48-almahairi16,
  title     = {Dynamic Capacity Networks},
  author    = {Almahairi, Amjad and Ballas, Nicolas and Cooijmans, Tim and Zheng, Yin and Larochelle, Hugo and Courville, Aaron},
  booktitle = {Proceedings of The 33rd International Conference on Machine Learning},
  pages     = {2549--2558},
  year      = {2016},
  editor    = {Balcan, Maria Florina and Weinberger, Kilian Q.},
  volume    = {48},
  series    = {Proceedings of Machine Learning Research},
  address   = {New York, New York, USA},
  month     = {20--22 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v48/almahairi16.pdf},
  url       = {https://proceedings.mlr.press/v48/almahairi16.html},
  abstract  = {We introduce the Dynamic Capacity Network (DCN), a neural network that can adaptively assign its capacity across different portions of the input data. This is achieved by combining modules of two types: low-capacity sub-networks and high-capacity sub-networks. The low-capacity sub-networks are applied across most of the input, but also provide a guide to select a few portions of the input on which to apply the high-capacity sub-networks. The selection is made using a novel gradient-based attention mechanism, that efficiently identifies input regions for which the DCN’s output is most sensitive and to which we should devote more capacity. We focus our empirical evaluation on the Cluttered MNIST and SVHN image datasets. Our findings indicate that DCNs are able to drastically reduce the number of computations, compared to traditional convolutional neural networks, while maintaining similar or even better performance.}
}
Endnote
%0 Conference Paper
%T Dynamic Capacity Networks
%A Amjad Almahairi
%A Nicolas Ballas
%A Tim Cooijmans
%A Yin Zheng
%A Hugo Larochelle
%A Aaron Courville
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger
%F pmlr-v48-almahairi16
%I PMLR
%P 2549--2558
%U https://proceedings.mlr.press/v48/almahairi16.html
%V 48
%X We introduce the Dynamic Capacity Network (DCN), a neural network that can adaptively assign its capacity across different portions of the input data. This is achieved by combining modules of two types: low-capacity sub-networks and high-capacity sub-networks. The low-capacity sub-networks are applied across most of the input, but also provide a guide to select a few portions of the input on which to apply the high-capacity sub-networks. The selection is made using a novel gradient-based attention mechanism, that efficiently identifies input regions for which the DCN’s output is most sensitive and to which we should devote more capacity. We focus our empirical evaluation on the Cluttered MNIST and SVHN image datasets. Our findings indicate that DCNs are able to drastically reduce the number of computations, compared to traditional convolutional neural networks, while maintaining similar or even better performance.
RIS
TY - CPAPER
TI - Dynamic Capacity Networks
AU - Amjad Almahairi
AU - Nicolas Ballas
AU - Tim Cooijmans
AU - Yin Zheng
AU - Hugo Larochelle
AU - Aaron Courville
BT - Proceedings of The 33rd International Conference on Machine Learning
DA - 2016/06/11
ED - Maria Florina Balcan
ED - Kilian Q. Weinberger
ID - pmlr-v48-almahairi16
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 48
SP - 2549
EP - 2558
L1 - http://proceedings.mlr.press/v48/almahairi16.pdf
UR - https://proceedings.mlr.press/v48/almahairi16.html
AB - We introduce the Dynamic Capacity Network (DCN), a neural network that can adaptively assign its capacity across different portions of the input data. This is achieved by combining modules of two types: low-capacity sub-networks and high-capacity sub-networks. The low-capacity sub-networks are applied across most of the input, but also provide a guide to select a few portions of the input on which to apply the high-capacity sub-networks. The selection is made using a novel gradient-based attention mechanism, that efficiently identifies input regions for which the DCN’s output is most sensitive and to which we should devote more capacity. We focus our empirical evaluation on the Cluttered MNIST and SVHN image datasets. Our findings indicate that DCNs are able to drastically reduce the number of computations, compared to traditional convolutional neural networks, while maintaining similar or even better performance.
ER -
APA
Almahairi, A., Ballas, N., Cooijmans, T., Zheng, Y., Larochelle, H. & Courville, A. (2016). Dynamic Capacity Networks. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:2549-2558. Available from https://proceedings.mlr.press/v48/almahairi16.html.