Freeze and Chaos: NTK views on DNN Normalization, Checkerboard and Boundary Artifacts
Proceedings of Mathematical and Scientific Machine Learning, PMLR 190:257-270, 2022.
Abstract
We analyze architectural features of Deep Neural Networks (DNNs) using the so-called Neural Tangent Kernel (NTK), which describes the training and generalization of DNNs in the infinite-width setting. In this setting, we show that for fully-connected DNNs, as the depth grows, two regimes appear: freeze (or order), where the (scaled) NTK converges to a constant, and chaos, where it converges to a Kronecker delta. Extreme freeze slows down training, while extreme chaos hinders generalization. With the scaled ReLU as a nonlinearity, the network ends up in the frozen regime. In contrast, Layer Normalization brings the network into the chaotic regime; we observe a similar effect for Batch Normalization (BN) applied after the last nonlinearity. We uncover the same freeze and chaos modes in Deep Deconvolutional Networks (DC-NNs), and our analysis explains the appearance of so-called checkerboard patterns and border artifacts. Moving the network into the chaotic regime prevents checkerboard patterns; we propose a graph-based parametrization which eliminates border artifacts; finally, we introduce a new layer-dependent learning rate to improve the convergence of DC-NNs. We illustrate our findings on DCGANs: the frozen regime leads to a collapse of the generator to a checkerboard mode, which can be avoided by tuning the nonlinearity to reach the chaotic regime. As a result, we obtain good-quality samples for DCGANs without BN.
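The freeze/chaos distinction above is phrased in terms of the large-depth behavior of the NTK, Theta(x1, x2) = <grad_theta f(x1), grad_theta f(x2)>. As an illustrative aid (not part of the paper), the sketch below computes the empirical NTK of a finite-width fully-connected network in JAX under the NTK parametrization with a sqrt(2)-scaled ReLU; the specific widths, depth, and scaling are assumptions chosen for illustration, not the paper's exact setup. Comparing Theta(x1, x2) / Theta(x1, x1) for distinct normalized inputs while increasing the depth gives a finite-width proxy for freeze (ratio approaching 1) versus chaos (ratio approaching 0).

```python
# Illustrative sketch only: empirical NTK of a deep fully-connected net in JAX.
# Widths, depth, and the sqrt(2)-scaled ReLU are assumptions for demonstration.
import jax
import jax.numpy as jnp

def init_params(key, widths):
    # NTK parametrization: weights ~ N(0, 1), rescaled by 1/sqrt(fan_in) in the forward pass.
    params = []
    for d_in, d_out in zip(widths[:-1], widths[1:]):
        key, sub = jax.random.split(key)
        params.append(jax.random.normal(sub, (d_in, d_out)))
    return params

def forward(params, x):
    h = x
    for i, W in enumerate(params):
        h = h @ W / jnp.sqrt(W.shape[0])
        if i < len(params) - 1:
            h = jnp.sqrt(2.0) * jax.nn.relu(h)  # scaled ReLU (assumed normalization)
    return h.squeeze()  # scalar output

def empirical_ntk(params, x1, x2):
    # Theta(x1, x2) = inner product of parameter gradients at x1 and x2.
    g1 = jax.grad(lambda p: forward(p, x1))(params)
    g2 = jax.grad(lambda p: forward(p, x2))(params)
    return sum(jnp.vdot(a, b)
               for a, b in zip(jax.tree_util.tree_leaves(g1),
                               jax.tree_util.tree_leaves(g2)))

key = jax.random.PRNGKey(0)
widths = [16] + [512] * 8 + [1]  # 8 hidden layers of width 512 (arbitrary choice)
params = init_params(key, widths)
x1 = jax.random.normal(jax.random.PRNGKey(1), (1, 16))
x2 = jax.random.normal(jax.random.PRNGKey(2), (1, 16))
x1, x2 = x1 / jnp.linalg.norm(x1), x2 / jnp.linalg.norm(x2)
print("Theta(x1, x1):", empirical_ntk(params, x1, x1))
print("Theta(x1, x2):", empirical_ntk(params, x1, x2))
```

Rerunning this sketch with larger depth, or with a normalization layer inserted before the output, is one informal way to observe the drift toward the frozen or chaotic regime discussed in the abstract; the paper's results concern the exact infinite-width limit rather than such finite-width estimates.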