The Power of Depth for Feedforward Neural Networks

Ronen Eldan; Ohad Shamir

29th Annual Conference on Learning Theory, PMLR 49:907-940, 2016.

Abstract

We show that there is a simple (approximately radial) function on $\mathbb{R}^d$, expressible by a small 3-layer feedforward neural network, which cannot be approximated by any 2-layer network, to more than a certain constant accuracy, unless its width is exponential in the dimension. The result holds for virtually all known activation functions, including rectified linear units, sigmoids and thresholds, and formally demonstrates that depth – even if increased by 1 – can be exponentially more valuable than width for standard feedforward neural networks. Moreover, compared to related results in the context of Boolean functions, our result requires fewer assumptions, and the proof techniques and construction are very different.

Cite this Paper

BibTeX


@InProceedings{pmlr-v49-eldan16,
  title = 	 {The Power of Depth for Feedforward Neural Networks},
  author = 	 {Eldan, Ronen and Shamir, Ohad},
  booktitle = 	 {29th Annual Conference on Learning Theory},
  pages = 	 {907--940},
  year = 	 {2016},
  editor = 	 {Feldman, Vitaly and Rakhlin, Alexander and Shamir, Ohad},
  volume = 	 {49},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Columbia University, New York, New York, USA},
  month = 	 {23--26 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v49/eldan16.pdf},
  url = 	 {https://proceedings.mlr.press/v49/eldan16.html},
  abstract = 	 {We show that there is a simple (approximately radial) function on $\mathbb{R}^d$, expressible by a small 3-layer feedforward neural network, which cannot be approximated by any 2-layer network, to more than a certain constant accuracy, unless its width is exponential in the dimension. The result holds for virtually all known activation functions, including rectified linear units, sigmoids and thresholds, and formally demonstrates that depth – even if increased by 1 – can be exponentially more valuable than width for standard feedforward neural networks. Moreover, compared to related results in the context of Boolean functions, our result requires fewer assumptions, and the proof techniques and construction are very different.}
}

Endnote

%0 Conference Paper
%T The Power of Depth for Feedforward Neural Networks
%A Ronen Eldan
%A Ohad Shamir
%B 29th Annual Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2016
%E Vitaly Feldman
%E Alexander Rakhlin
%E Ohad Shamir
%F pmlr-v49-eldan16
%I PMLR
%P 907--940
%U https://proceedings.mlr.press/v49/eldan16.html
%V 49
%X We show that there is a simple (approximately radial) function on $\mathbb{R}^d$, expressible by a small 3-layer feedforward neural network, which cannot be approximated by any 2-layer network, to more than a certain constant accuracy, unless its width is exponential in the dimension. The result holds for virtually all known activation functions, including rectified linear units, sigmoids and thresholds, and formally demonstrates that depth – even if increased by 1 – can be exponentially more valuable than width for standard feedforward neural networks. Moreover, compared to related results in the context of Boolean functions, our result requires fewer assumptions, and the proof techniques and construction are very different.

RIS


TY  - CPAPER
TI  - The Power of Depth for Feedforward Neural Networks
AU  - Ronen Eldan
AU  - Ohad Shamir
BT  - 29th Annual Conference on Learning Theory
DA  - 2016/06/06
ED  - Vitaly Feldman
ED  - Alexander Rakhlin
ED  - Ohad Shamir
ID  - pmlr-v49-eldan16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 49
SP  - 907
EP  - 940
L1  - http://proceedings.mlr.press/v49/eldan16.pdf
UR  - https://proceedings.mlr.press/v49/eldan16.html
AB  - We show that there is a simple (approximately radial) function on $\mathbb{R}^d$, expressible by a small 3-layer feedforward neural network, which cannot be approximated by any 2-layer network, to more than a certain constant accuracy, unless its width is exponential in the dimension. The result holds for virtually all known activation functions, including rectified linear units, sigmoids and thresholds, and formally demonstrates that depth – even if increased by 1 – can be exponentially more valuable than width for standard feedforward neural networks. Moreover, compared to related results in the context of Boolean functions, our result requires fewer assumptions, and the proof techniques and construction are very different.
ER  -

APA


Eldan, R. & Shamir, O. (2016). The Power of Depth for Feedforward Neural Networks. 29th Annual Conference on Learning Theory, in Proceedings of Machine Learning Research 49:907-940. Available from https://proceedings.mlr.press/v49/eldan16.html.
