Nearest Class-Center Simplification through Intermediate Layers
Proceedings of Topological, Algebraic, and Geometric Learning Workshops 2022, PMLR 196:37-47, 2022.
Abstract
Recent advances in neural network theory have introduced geometric properties that occur during training, past the Interpolation Threshold, where the training error reaches zero. We inquire into the phenomenon coined \emph{Neural Collapse} in the intermediate layers of the network, and emphasize the inner workings of Nearest Class-Center Mismatch inside a deepnet. We further show that these processes occur in both vision and language model architectures. Lastly, we propose a Stochastic Variability-Simplification Loss (SVSL) that encourages better geometric features in intermediate layers, yielding improvements in both training metrics and generalization.
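To make the two quantities referenced above concrete, the sketch below shows, under stated assumptions, how Nearest Class-Center (NCC) mismatch at an intermediate layer can be measured and how a stochastic within-class variability penalty over intermediate layers might look. This is an illustrative reading of the abstract, not the paper's implementation: the function names (ncc_mismatch, svsl_term), the choice of Euclidean distance, and sampling a single layer per step are all assumptions.

```python
# Illustrative sketch (assumed, not the paper's code): NCC mismatch at a layer
# and a hypothetical stochastic variability-simplification regularizer.
import torch


def class_means(features: torch.Tensor, labels: torch.Tensor, num_classes: int) -> torch.Tensor:
    """Per-class means of a batch of feature vectors; shape (num_classes, dim)."""
    means = torch.zeros(num_classes, features.size(1), device=features.device)
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            means[c] = features[mask].mean(dim=0)
    return means


def ncc_mismatch(features: torch.Tensor, labels: torch.Tensor, num_classes: int) -> torch.Tensor:
    """Fraction of samples whose nearest class center disagrees with the true label."""
    means = class_means(features, labels, num_classes)
    dists = torch.cdist(features, means)      # (batch, num_classes) pairwise distances
    ncc_pred = dists.argmin(dim=1)             # nearest class-center prediction
    return (ncc_pred != labels).float().mean()


def svsl_term(features_by_layer: list, labels: torch.Tensor, num_classes: int) -> torch.Tensor:
    """Hypothetical stochastic variability term: penalize within-class scatter at
    one intermediate layer sampled uniformly at random each training step."""
    layer = torch.randint(len(features_by_layer), (1,)).item()
    feats = features_by_layer[layer].flatten(start_dim=1)
    means = class_means(feats, labels, num_classes)
    return ((feats - means[labels]) ** 2).sum(dim=1).mean()
```

In such a sketch, the regularizer would be added to the usual objective, e.g. `loss = cross_entropy + lam * svsl_term(intermediate_feats, labels, num_classes)`, where `lam` is a hypothetical weighting hyperparameter; sampling one layer per step keeps the extra cost small while still pushing intermediate representations toward collapsed, NCC-consistent geometry.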