Hypernym Bias: Unraveling Deep Classifier Training Dynamics through the Lens of Class Hierarchy

Roman Malashin, Yachnaya Valeria, Alexandr V. Mullin
Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, PMLR 258:5131-5139, 2025.

Abstract

We investigate the training dynamics of deep classifiers by examining how hierarchical relationships between classes evolve during training. Through extensive experiments, we argue that the learning process in classification problems can be understood through the lens of label clustering. Specifically, we observe that networks tend to distinguish higher-level (hypernym) categories in the early stages of training, and learn more specific (hyponym) categories later. We introduce a novel framework to track the evolution of the feature manifold during training, revealing how the hierarchy of class relations emerges and refines across the network layers. Our analysis demonstrates that the learned representations closely align with the semantic structure of the dataset, providing a quantitative description of the clustering process. Notably, we show that in the hypernym label space, certain properties of neural collapse appear earlier than in the hyponym label space, helping to bridge the gap between the initial and terminal phases of learning. We believe our findings offer new insights into the mechanisms driving hierarchical learning in deep networks, paving the way for future advancements in understanding deep learning dynamics.
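As a rough illustration of the phenomenon the abstract describes (a minimal sketch, not the authors' framework), hypernym bias can be probed by mapping fine (hyponym) labels onto coarse (hypernym) labels, e.g., via CIFAR-100's 100 fine / 20 coarse classes or a WordNet cut of ImageNet, and scoring the same predictions at both levels over the course of training. The mapping FINE_TO_COARSE and the helper coarse_accuracy below are hypothetical names introduced for this sketch.

# Illustrative sketch: score the same fine-label predictions at two
# levels of an assumed class hierarchy. If a network separates hypernyms
# before hyponyms, coarse accuracy should rise well before fine accuracy.
import numpy as np

# Hypothetical fine -> coarse (hyponym -> hypernym) label mapping.
FINE_TO_COARSE = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2}

def accuracy(pred, target):
    return float(np.mean(pred == target))

def coarse_accuracy(pred_fine, target_fine, fine_to_coarse):
    # Project fine-label predictions and targets onto hypernym labels.
    to_coarse = np.vectorize(fine_to_coarse.get)
    return accuracy(to_coarse(pred_fine), to_coarse(target_fine))

# Toy example: predictions that confuse hyponyms *within* the correct
# hypernym score 0.0 on fine accuracy but 1.0 on coarse accuracy.
target = np.array([0, 1, 2, 3, 4, 5])
pred = np.array([1, 0, 3, 2, 5, 4])  # every hyponym wrong, hypernym right
print("fine accuracy:  ", accuracy(pred, target))
print("coarse accuracy:", coarse_accuracy(pred, target, FINE_TO_COARSE))

If the paper's claim holds, the coarse curve saturates early while the fine curve keeps improving; the same relabeling lets neural-collapse metrics be evaluated in the hypernym label space as well.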

Cite this Paper


BibTeX
@InProceedings{pmlr-v258-malashin25a,
  title     = {Hypernym Bias: Unraveling Deep Classifier Training Dynamics through the Lens of Class Hierarchy},
  author    = {Malashin, Roman and Valeria, Yachnaya and Mullin, Alexandr V.},
  booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  pages     = {5131--5139},
  year      = {2025},
  editor    = {Li, Yingzhen and Mandt, Stephan and Agrawal, Shipra and Khan, Emtiyaz},
  volume    = {258},
  series    = {Proceedings of Machine Learning Research},
  month     = {03--05 May},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v258/main/assets/malashin25a/malashin25a.pdf},
  url       = {https://proceedings.mlr.press/v258/malashin25a.html}
}
Endnote
%0 Conference Paper
%T Hypernym Bias: Unraveling Deep Classifier Training Dynamics through the Lens of Class Hierarchy
%A Roman Malashin
%A Yachnaya Valeria
%A Alexandr V. Mullin
%B Proceedings of The 28th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2025
%E Yingzhen Li
%E Stephan Mandt
%E Shipra Agrawal
%E Emtiyaz Khan
%F pmlr-v258-malashin25a
%I PMLR
%P 5131--5139
%U https://proceedings.mlr.press/v258/malashin25a.html
%V 258
APA
Malashin, R., Valeria, Y. & Mullin, A. V. (2025). Hypernym Bias: Unraveling Deep Classifier Training Dynamics through the Lens of Class Hierarchy. Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 258:5131-5139. Available from https://proceedings.mlr.press/v258/malashin25a.html.