Understanding Priors in Bayesian Neural Networks at the Unit Level
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:6458-6467, 2019.
We investigate deep Bayesian neural networks with Gaussian priors on the weights and a class of ReLU-like nonlinearities. Bayesian neural networks with Gaussian priors are well known to induce an L2, “weight decay”, regularization. Our results indicate a more intricate regularization effect at the level of the unit activations. Our main result establishes that the induced prior distribution on the units before and after activation becomes increasingly heavy-tailed with the depth of the layer. We show that first layer units are Gaussian, second layer units are sub-exponential, and units in deeper layers are characterized by sub-Weibull distributions. Our results provide new theoretical insight on deep Bayesian neural networks, which we corroborate with simulation experiments.