On the emergence of simplex symmetry in the final and penultimate layers of neural network classifiers

Weinan E; Stephan Wojtowytsch

Proceedings of the 2nd Mathematical and Scientific Machine Learning Conference, PMLR 145:270-290, 2022.

Abstract

A recent numerical study observed that neural network classifiers enjoy a large degree of symmetry in the penultimate layer. Namely, if $h(x) = Af(x) + b$ where $A$ is a linear map and $f$ is the output of the penultimate layer of the network (after activation), then all data points $x_{i, 1}, \dots, x_{i, N_i}$ in a class $C_i$ are mapped to a single point $y_i$ by $f$ and the points $y_i$ are located at the vertices of a regular $(k-1)$-dimensional standard simplex in a high-dimensional Euclidean space. We explain this observation analytically in toy models for highly expressive deep neural networks. In complementary examples, we demonstrate rigorously that even the final output of the classifier $h$ is not uniform over data samples from a class $C_i$ if $h$ is a shallow network (or if the deeper layers do not bring the data samples into a convenient geometric configuration).
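
The geometric configuration described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes that $k$ denotes the number of classes, and the helper names (`simplex_vertices`, `dot`, `cosine`) and the value `k = 4` are hypothetical choices made for illustration. The sketch builds one common realization of a regular $(k-1)$-dimensional simplex, namely the standard basis vectors of $\mathbb{R}^k$ shifted so that their centroid is the origin, and computes the pairwise distances and angles between the vertices.

```python
import math
from itertools import combinations


def simplex_vertices(k):
    """Vertices of a regular (k-1)-simplex in R^k, centered at the origin.

    Built from the standard basis vectors e_1..e_k by subtracting their
    centroid (1/k, ..., 1/k). This is an illustrative construction only.
    """
    return [[(1.0 if i == j else 0.0) - 1.0 / k for j in range(k)]
            for i in range(k)]


def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine of the angle between two nonzero vectors."""
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))


k = 4  # hypothetical number of classes
ys = simplex_vertices(k)  # stand-ins for the class points y_1..y_k

# In a regular simplex every pair of vertices is equally far apart,
# and (once centered) every pair spans the same angle.
dists = [math.dist(u, v) for u, v in combinations(ys, 2)]
cosines = [cosine(u, v) for u, v in combinations(ys, 2)]
```

For this construction every pairwise distance equals $\sqrt{2}$ and every pairwise cosine equals $-1/(k-1)$, so the vertices are as evenly spread as $k$ points around a common center can be.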

Cite this Paper

BibTeX


@InProceedings{pmlr-v145-e22b,
  title = 	 {On the emergence of simplex symmetry in the final and penultimate layers of neural network classifiers},
  author =       {E, Weinan and Wojtowytsch, Stephan},
  booktitle = 	 {Proceedings of the 2nd Mathematical and Scientific Machine Learning Conference},
  pages = 	 {270--290},
  year = 	 {2022},
  editor = 	 {Bruna, Joan and Hesthaven, Jan and Zdeborova, Lenka},
  volume = 	 {145},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {16--19 Aug},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v145/e22b/e22b.pdf},
  url = 	 {https://proceedings.mlr.press/v145/e22b.html},
  abstract = 	 {A recent numerical study observed that neural network classifiers enjoy a large degree of symmetry in the penultimate layer. Namely, if $h(x) = Af(x) + b$ where $A$ is a linear map and $f$ is the output of the penultimate layer of the network (after activation), then all data points $x_{i, 1}, \dots, x_{i, N_i}$ in a class $C_i$ are mapped to a single point $y_i$ by $f$ and the points $y_i$ are located at the vertices of a regular $(k-1)$-dimensional standard simplex in a high-dimensional Euclidean space. We explain this observation analytically in toy models for highly expressive deep neural networks. In complementary examples, we demonstrate rigorously that even the final output of the classifier $h$ is not uniform over data samples from a class $C_i$ if $h$ is a shallow network (or if the deeper layers do not bring the data samples into a convenient geometric configuration).}
}

Endnote

%0 Conference Paper
%T On the emergence of simplex symmetry in the final and penultimate layers of neural network classifiers
%A Weinan E
%A Stephan Wojtowytsch
%B Proceedings of the 2nd Mathematical and Scientific Machine Learning Conference
%C Proceedings of Machine Learning Research
%D 2022
%E Joan Bruna
%E Jan Hesthaven
%E Lenka Zdeborova
%F pmlr-v145-e22b
%I PMLR
%P 270--290
%U https://proceedings.mlr.press/v145/e22b.html
%V 145
%X A recent numerical study observed that neural network classifiers enjoy a large degree of symmetry in the penultimate layer. Namely, if $h(x) = Af(x) + b$ where $A$ is a linear map and $f$ is the output of the penultimate layer of the network (after activation), then all data points $x_{i, 1}, \dots, x_{i, N_i}$ in a class $C_i$ are mapped to a single point $y_i$ by $f$ and the points $y_i$ are located at the vertices of a regular $(k-1)$-dimensional standard simplex in a high-dimensional Euclidean space. We explain this observation analytically in toy models for highly expressive deep neural networks. In complementary examples, we demonstrate rigorously that even the final output of the classifier $h$ is not uniform over data samples from a class $C_i$ if $h$ is a shallow network (or if the deeper layers do not bring the data samples into a convenient geometric configuration).

APA


E, W. & Wojtowytsch, S. (2022). On the emergence of simplex symmetry in the final and penultimate layers of neural network classifiers. Proceedings of the 2nd Mathematical and Scientific Machine Learning Conference, in Proceedings of Machine Learning Research 145:270-290. Available from https://proceedings.mlr.press/v145/e22b.html.
