Nonlinear spiked covariance matrices and signal propagation in deep neural networks

Zhichao Wang, Denny Wu, Zhou Fan
Proceedings of Thirty Seventh Conference on Learning Theory, PMLR 247:4891-4957, 2024.

Abstract

Many recent works have studied the eigenvalue spectrum of the Conjugate Kernel (CK) defined by the nonlinear feature map of a feedforward neural network. However, existing results only establish weak convergence of the empirical eigenvalue distribution, and fall short of providing precise quantitative characterizations of the “spike” eigenvalues and eigenvectors that often capture the low-dimensional signal structure of the learning problem. In this work, we characterize these signal eigenvalues and eigenvectors for a nonlinear version of the spiked covariance model, including the CK as a special case. Using this general result, we give a quantitative description of how spiked eigenstructure in the input data propagates through the hidden layers of a neural network with random weights. As a second application, we study a simple regime of representation learning where the weight matrix develops a rank-one signal component over training and characterize the alignment of the target function with the spike eigenvector of the CK on test data.
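For orientation, the two objects named in the abstract have standard definitions, recorded below as a minimal sketch; the notation here is illustrative and not taken verbatim from the paper.

% Spiked covariance model: data x_1, ..., x_n in R^d whose population
% covariance is identity plus a low-rank "signal" perturbation:
\[
  \Sigma \;=\; I_d + \sum_{i=1}^{r} \theta_i\, u_i u_i^\top,
  \qquad \theta_i > 0, \quad \|u_i\| = 1.
\]
% Conjugate Kernel (CK): for a hidden layer with random weights
% W in R^{N x d} and entrywise activation sigma, the feature map is
% x -> sigma(Wx)/sqrt(N), and the CK is the Gram matrix of these
% features over the n samples X = (x_1, ..., x_n) in R^{d x n}:
\[
  K_{\mathrm{CK}} \;=\; \frac{1}{N}\, \sigma(WX)^\top \sigma(WX)
  \;\in\; \mathbb{R}^{n \times n}.
\]

The paper's first application can then be read as tracking how the spike pair (theta_i, u_i) in the input covariance is mapped to a spike eigenvalue/eigenvector of K_CK, layer by layer.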

Cite this Paper


BibTeX
@InProceedings{pmlr-v247-wang24b,
  title     = {Nonlinear spiked covariance matrices and signal propagation in deep neural networks},
  author    = {Wang, Zhichao and Wu, Denny and Fan, Zhou},
  booktitle = {Proceedings of Thirty Seventh Conference on Learning Theory},
  pages     = {4891--4957},
  year      = {2024},
  editor    = {Agrawal, Shipra and Roth, Aaron},
  volume    = {247},
  series    = {Proceedings of Machine Learning Research},
  month     = {30 Jun--03 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v247/wang24b/wang24b.pdf},
  url       = {https://proceedings.mlr.press/v247/wang24b.html}
}
Endnote
%0 Conference Paper
%T Nonlinear spiked covariance matrices and signal propagation in deep neural networks
%A Zhichao Wang
%A Denny Wu
%A Zhou Fan
%B Proceedings of Thirty Seventh Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2024
%E Shipra Agrawal
%E Aaron Roth
%F pmlr-v247-wang24b
%I PMLR
%P 4891--4957
%U https://proceedings.mlr.press/v247/wang24b.html
%V 247
APA
Wang, Z., Wu, D., & Fan, Z. (2024). Nonlinear spiked covariance matrices and signal propagation in deep neural networks. Proceedings of Thirty Seventh Conference on Learning Theory, in Proceedings of Machine Learning Research 247:4891-4957. Available from https://proceedings.mlr.press/v247/wang24b.html.
