How do infinite width bounded norm networks look in function space?

Pedro Savarese, Itay Evron, Daniel Soudry, Nathan Srebro
Proceedings of the Thirty-Second Conference on Learning Theory, PMLR 99:2667-2690, 2019.

Abstract

We consider the question of what functions can be captured by ReLU networks with an unbounded number of units (infinite width), but where the overall network Euclidean norm (sum of squares of all weights in the system, except for an unregularized bias term for each unit) is bounded; or equivalently what is the minimal norm required to approximate a given function. For functions $f:\mathbb R \rightarrow\mathbb R$ and a single hidden layer, we show that the minimal network norm for representing $f$ is $\max(\int \lvert f''(x) \rvert \mathrm{d} x, \lvert f'(-\infty) + f'(+\infty) \rvert)$, and hence the minimal norm fit for a sample is given by a linear spline interpolation.
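As an illustration only (not code from the paper), the expression in the abstract can be evaluated by hand for piecewise linear functions. The sketch below assumes $f$ is described by the slopes of its linear pieces, listed left to right, so that $\int \lvert f''(x) \rvert \mathrm{d} x$ reduces to the sum of absolute slope changes at the knots and $f'(\pm\infty)$ are the first and last slopes. The function names `spline_slopes` and `piecewise_linear_norm` are hypothetical, and extending a spline beyond its end points with the outermost slopes is an assumption made here for simplicity.

```python
def spline_slopes(xs, ys):
    """Slopes of the linear spline through the points (xs[i], ys[i]).

    Assumes xs is strictly increasing and has at least two entries.
    """
    return [
        (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    ]


def piecewise_linear_norm(slopes):
    """Evaluate max(int |f''| dx, |f'(-inf) + f'(+inf)|) for piecewise linear f.

    `slopes` lists the slope of each linear piece from left to right.
    For such an f, f'' is a sum of point masses at the knots, so the
    integral of |f''| is the sum of absolute slope changes.
    """
    curvature = sum(abs(b - a) for a, b in zip(slopes, slopes[1:]))
    tails = abs(slopes[0] + slopes[-1])
    return max(curvature, tails)


print(piecewise_linear_norm([-1.0, 1.0]))  # f(x) = |x|        -> 2.0
print(piecewise_linear_norm([0.0, 1.0]))   # f(x) = max(x, 0)  -> 1.0
print(piecewise_linear_norm([1.0]))        # f(x) = x          -> 2.0
```

The last case shows why the second term in the maximum matters: a linear function has $f'' = 0$ everywhere, yet its value under the formula is nonzero because $\lvert f'(-\infty) + f'(+\infty) \rvert = 2$.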

Cite this Paper


BibTeX
@InProceedings{pmlr-v99-savarese19a,
  title = {How do infinite width bounded norm networks look in function space?},
  author = {Savarese, Pedro and Evron, Itay and Soudry, Daniel and Srebro, Nathan},
  booktitle = {Proceedings of the Thirty-Second Conference on Learning Theory},
  pages = {2667--2690},
  year = {2019},
  editor = {Beygelzimer, Alina and Hsu, Daniel},
  volume = {99},
  series = {Proceedings of Machine Learning Research},
  month = {25--28 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v99/savarese19a/savarese19a.pdf},
  url = {https://proceedings.mlr.press/v99/savarese19a.html},
  abstract = {We consider the question of what functions can be captured by ReLU networks with an unbounded number of units (infinite width), but where the overall network Euclidean norm (sum of squares of all weights in the system, except for an unregularized bias term for each unit) is bounded; or equivalently what is the minimal norm required to approximate a given function. For functions $f:\mathbb R \rightarrow\mathbb R$ and a single hidden layer, we show that the minimal network norm for representing $f$ is $\max(\int \lvert f''(x) \rvert \mathrm{d} x, \lvert f'(-\infty) + f'(+\infty) \rvert)$, and hence the minimal norm fit for a sample is given by a linear spline interpolation.}
}
Endnote
%0 Conference Paper
%T How do infinite width bounded norm networks look in function space?
%A Pedro Savarese
%A Itay Evron
%A Daniel Soudry
%A Nathan Srebro
%B Proceedings of the Thirty-Second Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2019
%E Alina Beygelzimer
%E Daniel Hsu
%F pmlr-v99-savarese19a
%I PMLR
%P 2667--2690
%U https://proceedings.mlr.press/v99/savarese19a.html
%V 99
%X We consider the question of what functions can be captured by ReLU networks with an unbounded number of units (infinite width), but where the overall network Euclidean norm (sum of squares of all weights in the system, except for an unregularized bias term for each unit) is bounded; or equivalently what is the minimal norm required to approximate a given function. For functions $f:\mathbb R \rightarrow\mathbb R$ and a single hidden layer, we show that the minimal network norm for representing $f$ is $\max(\int \lvert f''(x) \rvert \mathrm{d} x, \lvert f'(-\infty) + f'(+\infty) \rvert)$, and hence the minimal norm fit for a sample is given by a linear spline interpolation.
APA
Savarese, P., Evron, I., Soudry, D. & Srebro, N. (2019). How do infinite width bounded norm networks look in function space? Proceedings of the Thirty-Second Conference on Learning Theory, in Proceedings of Machine Learning Research 99:2667-2690. Available from https://proceedings.mlr.press/v99/savarese19a.html.
