Non-asymptotic approximations of Gaussian neural networks via second-order Poincaré inequalities
Proceedings of the 6th Symposium on Advances in Approximate Bayesian Inference, PMLR 253:45-78, 2024.
Abstract
There is a recent and growing literature on large-width asymptotic and non-asymptotic properties of deep Gaussian neural networks (NNs), namely NNs whose weights are initialized from Gaussian distributions. For a Gaussian NN of depth $L\geq1$ and width $n\geq1$, it is well known that, as $n\rightarrow+\infty$, the NN’s output converges (in distribution) to a Gaussian process. Recently, some quantitative versions of this result, also known as quantitative central limit theorems (QCLTs), have been obtained, showing that the rate of convergence is $n^{-1}$ in the $2$-Wasserstein distance, and that such a rate is optimal. In this paper, we investigate the use of second-order Poincaré inequalities as an alternative approach to establishing QCLTs for the NN’s output. Previous approaches consist of a careful analysis of the NN, combining non-trivial probabilistic tools with ad hoc techniques that rely on the recursive definition of the network, typically by means of an induction argument over the layers, and it is unclear if and how they still apply to other NN architectures. Instead, the use of second-order Poincaré inequalities relies only on the fact that the NN is a functional of a Gaussian process, reducing the problem of establishing QCLTs to the algebraic problem of computing the gradient and Hessian of the NN’s output, a computation that still applies to other NN architectures. We show that our approach is effective in establishing QCLTs for the NN’s output, though it leads to suboptimal rates of convergence. We argue that such a worsening in the rates is peculiar to second-order Poincaré inequalities, and that it should be interpreted as the "cost" of having a straightforward and general procedure for obtaining QCLTs.
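To make the "gradient and Hessian" reduction concrete, the following is a minimal sketch under stated assumptions, not the construction used in the paper. It assumes a shallow network with scalar input $x$, a tanh activation, and output $f = n^{-1/2}\sum_{j} v_j \tanh(w_j x)$ with i.i.d. standard Gaussian weights $w_j, v_j$; all function names are hypothetical. It computes the two quantities that a second-order Poincaré inequality typically controls: the norm of the gradient and the operator norm of the Hessian of the output with respect to the Gaussian weights.

```python
import math
import random


def output(w, v, x):
    """Shallow NN output f = n^{-1/2} * sum_j v_j * tanh(w_j * x) (assumed toy model)."""
    n = len(w)
    return sum(vj * math.tanh(wj * x) for wj, vj in zip(w, v)) / math.sqrt(n)


def gradient(w, v, x):
    """Gradient of f with respect to (w_1..w_n, v_1..v_n), in that order."""
    s = 1.0 / math.sqrt(len(w))
    dw = [s * vj * x * (1.0 - math.tanh(wj * x) ** 2) for wj, vj in zip(w, v)]
    dv = [s * math.tanh(wj * x) for wj in w]
    return dw + dv


def hessian_op_norm(w, v, x):
    """Operator norm of the Hessian of f with respect to the weights.

    The Hessian only couples w_j with v_j, so it splits into n symmetric
    2x2 blocks [[a, b], [b, 0]]. Its operator norm is the largest absolute
    eigenvalue over all blocks, which has a closed form.
    """
    s = 1.0 / math.sqrt(len(w))
    best = 0.0
    for wj, vj in zip(w, v):
        t = math.tanh(wj * x)
        sech2 = 1.0 - t * t
        a = s * vj * x * x * (-2.0 * t * sech2)  # d^2 f / d w_j^2
        b = s * x * sech2                        # d^2 f / d w_j d v_j
        best = max(best, (abs(a) + math.sqrt(a * a + 4.0 * b * b)) / 2.0)
    return best


def sample_weights(n, rng):
    """Draw i.i.d. standard Gaussian weights (the assumed initialization)."""
    w = [rng.gauss(0.0, 1.0) for _ in range(n)]
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return w, v


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (10, 100, 1000):
        w, v = sample_weights(n, rng)
        g = gradient(w, v, 1.0)
        grad_norm = math.sqrt(sum(gi * gi for gi in g))
        print(n, grad_norm, hessian_op_norm(w, v, 1.0))
```

In this toy setting the gradient norm stays of order one as the width grows, while the Hessian operator norm shrinks roughly like $n^{-1/2}$. This illustrates, without proving anything about the rates stated above, why bounds built from these two quantities vanish as $n\rightarrow+\infty$.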