Nearest Class-Center Simplification through Intermediate Layers

Ido Ben-Shaul, Shai Dekel
Proceedings of Topological, Algebraic, and Geometric Learning Workshops 2022, PMLR 196:37-47, 2022.

Abstract

Recent advances in neural network theory have introduced geometric properties that occur during training, past the Interpolation Threshold, where the training error reaches zero. We inquire into the phenomenon coined \emph{Neural Collapse} in the intermediate layers of the network, and emphasize the inner workings of Nearest Class-Center Mismatch inside a deep network. We further show that these processes occur both in vision and language model architectures. Lastly, we propose a Stochastic Variability-Simplification Loss (SVSL) that encourages better geometric features in intermediate layers, yielding improvements in both training metrics and generalization.
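
For a concrete picture of the quantity studied above, the following is a minimal PyTorch sketch (not the authors' implementation) of the nearest class-center (NCC) classifier evaluated on intermediate-layer features. The toy model, random data, and choice of probed layer are assumptions made only to keep the example self-contained.

# Minimal sketch (assumed setup, not the paper's code): nearest class-center (NCC)
# agreement measured on intermediate-layer features.
import torch
import torch.nn as nn

torch.manual_seed(0)

num_classes, dim, n = 10, 32, 512
x = torch.randn(n, dim)                       # stand-in inputs
y = torch.randint(0, num_classes, (n,))       # stand-in labels

model = nn.Sequential(
    nn.Linear(dim, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),             # "intermediate" layer we probe
    nn.Linear(64, num_classes),
)

feats = {}
def hook(module, inputs, output):             # capture post-activation features
    feats["h"] = output.detach()
model[3].register_forward_hook(hook)          # ReLU after the second Linear

with torch.no_grad():
    logits = model(x)
h = feats["h"]                                # (n, 64) intermediate features

# Class centers: mean feature vector of each class at this layer.
centers = torch.stack([h[y == c].mean(dim=0) for c in range(num_classes)])

# NCC prediction: assign each sample to its nearest class center.
ncc_pred = torch.cdist(h, centers).argmin(dim=1)

ncc_acc = (ncc_pred == y).float().mean()                        # NCC accuracy vs. labels
mismatch = (ncc_pred != logits.argmax(dim=1)).float().mean()    # NCC vs. network prediction
print(f"NCC accuracy: {ncc_acc:.3f}  NCC/network mismatch: {mismatch:.3f}")

In the setting of the paper, such NCC accuracy and NCC/network mismatch would be tracked per intermediate layer over the course of training rather than computed once on random data as above.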

Cite this Paper


BibTeX
@InProceedings{pmlr-v196-ben-shaul22a,
  title     = {Nearest Class-Center Simplification through Intermediate Layers},
  author    = {Ben-Shaul, Ido and Dekel, Shai},
  booktitle = {Proceedings of Topological, Algebraic, and Geometric Learning Workshops 2022},
  pages     = {37--47},
  year      = {2022},
  editor    = {Cloninger, Alexander and Doster, Timothy and Emerson, Tegan and Kaul, Manohar and Ktena, Ira and Kvinge, Henry and Miolane, Nina and Rieck, Bastian and Tymochko, Sarah and Wolf, Guy},
  volume    = {196},
  series    = {Proceedings of Machine Learning Research},
  month     = {25 Feb--22 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v196/ben-shaul22a/ben-shaul22a.pdf},
  url       = {https://proceedings.mlr.press/v196/ben-shaul22a.html},
  abstract  = {Recent advances in neural network theory have introduced geometric properties that occur during training, past the Interpolation Threshold- where the training error reaches zero. We inquire into the phenomena coined \emph{Neural Collapse} in the intermediate layers of the network, and emphasize the innerworkings of Nearest Class-Center Mismatch inside a deepnet. We further show that these processes occur both in vision and language model architectures. Lastly, we propose a Stochastic Variability-Simplification Loss (SVSL) that encourages better geometrical features in intermediate layers, yielding improvements in both train metrics and generalization.}
}
Endnote
%0 Conference Paper
%T Nearest Class-Center Simplification through Intermediate Layers
%A Ido Ben-Shaul
%A Shai Dekel
%B Proceedings of Topological, Algebraic, and Geometric Learning Workshops 2022
%C Proceedings of Machine Learning Research
%D 2022
%E Alexander Cloninger
%E Timothy Doster
%E Tegan Emerson
%E Manohar Kaul
%E Ira Ktena
%E Henry Kvinge
%E Nina Miolane
%E Bastian Rieck
%E Sarah Tymochko
%E Guy Wolf
%F pmlr-v196-ben-shaul22a
%I PMLR
%P 37--47
%U https://proceedings.mlr.press/v196/ben-shaul22a.html
%V 196
%X Recent advances in neural network theory have introduced geometric properties that occur during training, past the Interpolation Threshold- where the training error reaches zero. We inquire into the phenomena coined \emph{Neural Collapse} in the intermediate layers of the network, and emphasize the innerworkings of Nearest Class-Center Mismatch inside a deepnet. We further show that these processes occur both in vision and language model architectures. Lastly, we propose a Stochastic Variability-Simplification Loss (SVSL) that encourages better geometrical features in intermediate layers, yielding improvements in both train metrics and generalization.
APA
Ben-Shaul, I. & Dekel, S. (2022). Nearest Class-Center Simplification through Intermediate Layers. Proceedings of Topological, Algebraic, and Geometric Learning Workshops 2022, in Proceedings of Machine Learning Research 196:37-47. Available from https://proceedings.mlr.press/v196/ben-shaul22a.html.
