On the Implicit Geometry of Cross-Entropy Parameterizations for Label-Imbalanced Data

Tina Behnia, Ganesh Ramachandra Kini, Vala Vakilian, Christos Thrampoulidis
Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:10815-10838, 2023.

Abstract

Various logit-adjusted parameterizations of the cross-entropy (CE) loss have been proposed as alternatives to weighted CE for training large models on label-imbalanced data far beyond the zero train error regime. The driving force behind those designs has been the theory of implicit bias, which, for linear(ized) models, explains why they successfully induce bias on the optimization path towards solutions that favor minorities. Aiming to extend this theory to non-linear models, we investigate the implicit geometry of classifiers and embeddings that are learned by different CE parameterizations. Our main result characterizes the global minimizers of a non-convex cost-sensitive SVM classifier for the unconstrained features model, which serves as an abstraction of deep-nets. We derive closed-form formulas for the angles and norms of classifiers and embeddings as a function of the number of classes, the imbalance and the minority ratios, and the loss hyperparameters. Using these, we show that logit-adjusted parameterizations can be appropriately tuned to learn symmetric geometries irrespective of the imbalance ratio. We complement our analysis with experiments and an empirical study of convergence accuracy in deep-nets.
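For readers unfamiliar with logit-adjusted CE losses, the sketch below shows one generic form with per-class additive and multiplicative logit adjustments, in the spirit of the parameterizations the abstract refers to. It is a minimal illustration under assumed conventions, not the paper's exact formulation; the names (logit_adjusted_ce, class_priors, tau, gamma) and the specific choice of adjustments are placeholders for this example.

# Minimal sketch of a logit-adjusted cross-entropy loss (illustrative only,
# not the paper's exact parameterization). Per-class additive terms iota_y
# and multiplicative scalings Delta_y modify the logits before the softmax.
import torch
import torch.nn.functional as F

def logit_adjusted_ce(logits, targets, class_priors, tau=1.0, gamma=0.0):
    # logits:       (batch, num_classes) raw model scores f(x)
    # targets:      (batch,) integer class labels
    # class_priors: (num_classes,) empirical class frequencies pi_y
    # tau:          scale of the additive adjustment iota_y = tau * log(pi_y)
    # gamma:        exponent of the multiplicative adjustment Delta_y = pi_y ** gamma
    iota = tau * torch.log(class_priors)   # additive shift, enlarges minority margins
    delta = class_priors ** gamma          # multiplicative rescaling of logits
    adjusted = delta * logits + iota       # adjusted logit: Delta_y * f_y(x) + iota_y
    return F.cross_entropy(adjusted, targets)

# Example: 3 classes with a 10:1 head-to-tail imbalance.
priors = torch.tensor([10/12, 1/12, 1/12])
logits = torch.randn(4, 3)
targets = torch.tensor([0, 1, 2, 1])
loss = logit_adjusted_ce(logits, targets, priors, tau=1.0, gamma=0.25)

Setting tau = gamma = 0 recovers plain CE; the paper's analysis concerns how such hyperparameters shape the learned classifier and embedding geometry under imbalance.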

Cite this Paper


BibTeX
@InProceedings{pmlr-v206-behnia23a,
  title     = {On the Implicit Geometry of Cross-Entropy Parameterizations for Label-Imbalanced Data},
  author    = {Behnia, Tina and Ramachandra Kini, Ganesh and Vakilian, Vala and Thrampoulidis, Christos},
  booktitle = {Proceedings of The 26th International Conference on Artificial Intelligence and Statistics},
  pages     = {10815--10838},
  year      = {2023},
  editor    = {Ruiz, Francisco and Dy, Jennifer and van de Meent, Jan-Willem},
  volume    = {206},
  series    = {Proceedings of Machine Learning Research},
  month     = {25--27 Apr},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v206/behnia23a/behnia23a.pdf},
  url       = {https://proceedings.mlr.press/v206/behnia23a.html},
  abstract  = {Various logit-adjusted parameterizations of the cross-entropy (CE) loss have been proposed as alternatives to weighted CE for training large models on label-imbalanced data far beyond the zero train error regime. The driving force behind those designs has been the theory of implicit bias, which, for linear(ized) models, explains why they successfully induce bias on the optimization path towards solutions that favor minorities. Aiming to extend this theory to non-linear models, we investigate the implicit geometry of classifiers and embeddings that are learned by different CE parameterizations. Our main result characterizes the global minimizers of a non-convex cost-sensitive SVM classifier for the unconstrained features model, which serves as an abstraction of deep-nets. We derive closed-form formulas for the angles and norms of classifiers and embeddings as a function of the number of classes, the imbalance and the minority ratios, and the loss hyperparameters. Using these, we show that logit-adjusted parameterizations can be appropriately tuned to learn symmetric geometries irrespective of the imbalance ratio. We complement our analysis with experiments and an empirical study of convergence accuracy in deep-nets.}
}
Endnote
%0 Conference Paper
%T On the Implicit Geometry of Cross-Entropy Parameterizations for Label-Imbalanced Data
%A Tina Behnia
%A Ganesh Ramachandra Kini
%A Vala Vakilian
%A Christos Thrampoulidis
%B Proceedings of The 26th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2023
%E Francisco Ruiz
%E Jennifer Dy
%E Jan-Willem van de Meent
%F pmlr-v206-behnia23a
%I PMLR
%P 10815--10838
%U https://proceedings.mlr.press/v206/behnia23a.html
%V 206
%X Various logit-adjusted parameterizations of the cross-entropy (CE) loss have been proposed as alternatives to weighted CE for training large models on label-imbalanced data far beyond the zero train error regime. The driving force behind those designs has been the theory of implicit bias, which, for linear(ized) models, explains why they successfully induce bias on the optimization path towards solutions that favor minorities. Aiming to extend this theory to non-linear models, we investigate the implicit geometry of classifiers and embeddings that are learned by different CE parameterizations. Our main result characterizes the global minimizers of a non-convex cost-sensitive SVM classifier for the unconstrained features model, which serves as an abstraction of deep-nets. We derive closed-form formulas for the angles and norms of classifiers and embeddings as a function of the number of classes, the imbalance and the minority ratios, and the loss hyperparameters. Using these, we show that logit-adjusted parameterizations can be appropriately tuned to learn symmetric geometries irrespective of the imbalance ratio. We complement our analysis with experiments and an empirical study of convergence accuracy in deep-nets.
APA
Behnia, T., Ramachandra Kini, G., Vakilian, V. & Thrampoulidis, C. (2023). On the Implicit Geometry of Cross-Entropy Parameterizations for Label-Imbalanced Data. Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 206:10815-10838. Available from https://proceedings.mlr.press/v206/behnia23a.html.
