Neural Collapse for Cross-entropy Class-Imbalanced Learning with Unconstrained ReLU Features Model

Hien Dang; Tho Tran Huu; Tan Minh Nguyen; Nhat Ho

Proceedings of the 41st International Conference on Machine Learning, PMLR 235:10017-10040, 2024.

Abstract

The current paradigm of training deep neural networks for classification tasks includes minimizing the empirical risk, pushing the training loss value towards zero even after the training classification error has vanished. In this terminal phase of training, it has been observed that the last-layer features collapse to their class-means and these class-means converge to the vertices of a simplex Equiangular Tight Frame (ETF). This phenomenon is termed Neural Collapse ($\mathcal{NC}$). However, this characterization only holds for class-balanced datasets, where every class has the same number of training samples. When the training dataset is class-imbalanced, some $\mathcal{NC}$ properties no longer hold; for example, the geometry of the class-means skews away from the simplex ETF. In this paper, we generalize $\mathcal{NC}$ to the imbalanced regime for cross-entropy loss under the unconstrained ReLU features model. We demonstrate that while the within-class feature collapse property still holds in this setting, the class-means converge to a structure consisting of orthogonal vectors with lengths dependent on the number of training samples. Furthermore, we find that the classifier weights (i.e., the last-layer linear classifier) are aligned to the scaled and centered class-means, with scaling factors dependent on the number of training samples of each class. This generalizes $\mathcal{NC}$ in the class-balanced setting. We empirically validate our results through experiments on practical architectures and datasets.
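
The two geometries contrasted in the abstract can be illustrated numerically. The following is a minimal NumPy sketch, not code from the paper: it builds the standard simplex ETF for `K` classes and a set of mutually orthogonal vectors with unequal lengths, then compares their pairwise cosines. The function names and the values in `hypothetical_lengths` are placeholders chosen for illustration only; they are not the sample-size-dependent lengths derived in the paper.

```python
import numpy as np


def simplex_etf(num_classes: int) -> np.ndarray:
    """Columns are the K vertices of a simplex ETF (unit norm, equal angles)."""
    k = num_classes
    return np.sqrt(k / (k - 1)) * (np.eye(k) - np.ones((k, k)) / k)


def off_diagonal_cosines(vectors: np.ndarray) -> np.ndarray:
    """Cosine similarity between every pair of distinct columns."""
    unit = vectors / np.linalg.norm(vectors, axis=0, keepdims=True)
    cosines = unit.T @ unit
    return cosines[~np.eye(vectors.shape[1], dtype=bool)]


K = 4

# Balanced picture: class-means at the vertices of a simplex ETF.
etf = simplex_etf(K)
etf_off_diag = off_diagonal_cosines(etf)  # every entry equals -1 / (K - 1)

# Imbalanced picture described in the abstract: orthogonal class-means
# whose lengths differ per class. These lengths are made-up placeholders.
hypothetical_lengths = np.array([3.0, 2.0, 1.5, 1.0])
orthogonal_means = np.eye(K) * hypothetical_lengths  # column k = length_k * e_k
orth_off_diag = off_diagonal_cosines(orthogonal_means)  # every entry equals 0

print("ETF pairwise cosines:       ", np.round(etf_off_diag, 4))
print("Orthogonal pairwise cosines:", np.round(orth_off_diag, 4))
```

In the ETF case all vectors have the same norm and the same pairwise cosine of $-1/(K-1)$; in the orthogonal case the pairwise cosines are zero and only the lengths vary across classes.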

Cite this Paper

BibTeX

@InProceedings{pmlr-v235-dang24a,
  title = 	 {Neural Collapse for Cross-entropy Class-Imbalanced Learning with Unconstrained {R}e{LU} Features Model},
  author =       {Dang, Hien and Huu, Tho Tran and Nguyen, Tan Minh and Ho, Nhat},
  booktitle = 	 {Proceedings of the 41st International Conference on Machine Learning},
  pages = 	 {10017--10040},
  year = 	 {2024},
  editor = 	 {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = 	 {235},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {21--27 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v235/main/assets/dang24a/dang24a.pdf},
  url = 	 {https://proceedings.mlr.press/v235/dang24a.html},
}

Endnote

%0 Conference Paper
%T Neural Collapse for Cross-entropy Class-Imbalanced Learning with Unconstrained ReLU Features Model
%A Hien Dang
%A Tho Tran Huu
%A Tan Minh Nguyen
%A Nhat Ho
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-dang24a
%I PMLR
%P 10017--10040
%U https://proceedings.mlr.press/v235/dang24a.html
%V 235

APA

Dang, H., Huu, T.T., Nguyen, T.M. & Ho, N. (2024). Neural Collapse for Cross-entropy Class-Imbalanced Learning with Unconstrained ReLU Features Model. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:10017-10040. Available from https://proceedings.mlr.press/v235/dang24a.html.
