Non-asymptotic approximations of neural networks by Gaussian processes

Ronen Eldan, Dan Mikulincer, Tselil Schramm
Proceedings of Thirty Fourth Conference on Learning Theory, PMLR 134:1754-1775, 2021.

Abstract

We study the extent to which wide neural networks may be approximated by Gaussian processes when initialized with random weights. It is a well-established fact that as the width of a network goes to infinity, its law converges to that of a Gaussian process. We make this quantitative by establishing explicit convergence rates for the central limit theorem in an infinite-dimensional functional space, metrized with a natural transportation distance. We identify two regimes of interest: when the activation function is polynomial, its degree determines the rate of convergence, while for non-polynomial activations, the rate is governed by the smoothness of the function.
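To make the statement concrete, the sketch below (not taken from the paper; the 1/sqrt(n) scaling, the N(0, I/d) weight distribution, the tanh activation, and all variable names are illustrative assumptions) simulates a one-hidden-layer network at random initialization and compares the empirical law of its outputs at two fixed inputs with the limiting Gaussian-process covariance.

```python
import numpy as np

# A minimal numerical sketch (not the authors' construction or notation):
# a one-hidden-layer network
#     f(x) = n^{-1/2} * sum_i a_i * sigma(<w_i, x>),
# with a_i ~ N(0, 1) and w_i ~ N(0, I_d / d), evaluated at two fixed inputs.
# As the width n grows, the joint law of (f(x1), f(x2)) over random
# initializations should approach a centered Gaussian with covariance
# K(x, x') = E_w[sigma(<w, x>) * sigma(<w, x'>)].

rng = np.random.default_rng(0)
d, n, trials = 10, 5000, 2000
sigma = np.tanh                      # a smooth, non-polynomial activation

x1 = rng.standard_normal(d); x1 /= np.linalg.norm(x1)
x2 = rng.standard_normal(d); x2 /= np.linalg.norm(x2)

def sample_outputs(width, num_trials):
    """Draw (f(x1), f(x2)) for num_trials independent random initializations."""
    out = np.empty((num_trials, 2))
    for t in range(num_trials):
        W = rng.standard_normal((width, d)) / np.sqrt(d)   # hidden-layer weights
        a = rng.standard_normal(width)                     # output weights
        out[t, 0] = a @ sigma(W @ x1) / np.sqrt(width)
        out[t, 1] = a @ sigma(W @ x2) / np.sqrt(width)
    return out

# Monte-Carlo estimate of the limiting Gaussian-process covariance.
W_mc = rng.standard_normal((200_000, d)) / np.sqrt(d)
K11 = np.mean(sigma(W_mc @ x1) ** 2)
K12 = np.mean(sigma(W_mc @ x1) * sigma(W_mc @ x2))

F = sample_outputs(n, trials)
print("empirical var  f(x1):", F[:, 0].var(), "  GP limit:", K11)
print("empirical cov       :", np.cov(F[:, 0], F[:, 1])[0, 1], "  GP limit:", K12)
# A simple non-Gaussianity diagnostic: the excess kurtosis of f(x1) should
# shrink toward 0 as the width n increases.
print("excess kurtosis of f(x1):", np.mean(F[:, 0] ** 4) / F[:, 0].var() ** 2 - 3.0)
```

Rerunning the sketch with a small width (say n = 50) makes the gap to the Gaussian limit visible; the paper's contribution is to bound how fast such finite-width discrepancies vanish, with rates depending on whether the activation is polynomial or merely smooth.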

Cite this Paper


BibTeX
@InProceedings{pmlr-v134-eldan21a,
  title     = {Non-asymptotic approximations of neural networks by Gaussian processes},
  author    = {Eldan, Ronen and Mikulincer, Dan and Schramm, Tselil},
  booktitle = {Proceedings of Thirty Fourth Conference on Learning Theory},
  pages     = {1754--1775},
  year      = {2021},
  editor    = {Belkin, Mikhail and Kpotufe, Samory},
  volume    = {134},
  series    = {Proceedings of Machine Learning Research},
  month     = {15--19 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v134/eldan21a/eldan21a.pdf},
  url       = {https://proceedings.mlr.press/v134/eldan21a.html},
  abstract  = {We study the extent to which wide neural networks may be approximated by Gaussian processes, when initialized with random weights. It is a well-established fact that as the width of a network goes to infinity, its law converges to that of a Gaussian process. We make this quantitative by establishing explicit convergence rates for the central limit theorem in an infinite-dimensional functional space, metrized with a natural transportation distance. We identify two regimes of interest; when the activation function is polynomial, its degree determines the rate of convergence, while for non-polynomial activations, the rate is governed by the smoothness of the function.}
}
Endnote
%0 Conference Paper
%T Non-asymptotic approximations of neural networks by Gaussian processes
%A Ronen Eldan
%A Dan Mikulincer
%A Tselil Schramm
%B Proceedings of Thirty Fourth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2021
%E Mikhail Belkin
%E Samory Kpotufe
%F pmlr-v134-eldan21a
%I PMLR
%P 1754--1775
%U https://proceedings.mlr.press/v134/eldan21a.html
%V 134
%X We study the extent to which wide neural networks may be approximated by Gaussian processes, when initialized with random weights. It is a well-established fact that as the width of a network goes to infinity, its law converges to that of a Gaussian process. We make this quantitative by establishing explicit convergence rates for the central limit theorem in an infinite-dimensional functional space, metrized with a natural transportation distance. We identify two regimes of interest; when the activation function is polynomial, its degree determines the rate of convergence, while for non-polynomial activations, the rate is governed by the smoothness of the function.
APA
Eldan, R., Mikulincer, D. & Schramm, T. (2021). Non-asymptotic approximations of neural networks by Gaussian processes. Proceedings of Thirty Fourth Conference on Learning Theory, in Proceedings of Machine Learning Research 134:1754-1775. Available from https://proceedings.mlr.press/v134/eldan21a.html.