Initial Guessing Bias: How Untrained Networks Favor Some Classes

Emanuele Francazi, Aurelien Lucchi, Marco Baity-Jesi
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:13783-13839, 2024.

Abstract

Understanding and controlling biasing effects in neural networks is crucial for ensuring accurate and fair model performance. In the context of classification problems, we provide a theoretical analysis demonstrating that the structure of a deep neural network (DNN) can condition the model to assign all predictions to the same class, even before the beginning of training and in the absence of explicit biases. We prove that, besides dataset properties, the presence of this phenomenon, which we call Initial Guessing Bias (IGB), is influenced by model choices, including dataset preprocessing methods and architectural decisions such as activation functions, max-pooling layers, and network depth. Our analysis of IGB provides information for architecture selection and model initialization. We also highlight theoretical consequences, such as the breakdown of node-permutation symmetry, the violation of self-averaging, and the non-trivial effects that depth has on the phenomenon.
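
The snippet below is a minimal sketch (not the authors' code) of how IGB can be observed empirically: an untrained ReLU MLP is fed i.i.d. standard-Gaussian inputs, and the distribution of its argmax predictions across classes is tallied before any training. The architecture, depth, width, and input statistics are illustrative assumptions.

import torch
import torch.nn as nn

torch.manual_seed(0)

n_classes = 2
depth = 10          # deeper ReLU stacks tend to show a stronger bias
width = 512
in_dim = 100
n_inputs = 10_000   # random "datapoints" drawn i.i.d. from a standard Gaussian

# Build a plain fully connected ReLU network with default initialization.
layers, d = [], in_dim
for _ in range(depth):
    layers += [nn.Linear(d, width), nn.ReLU()]
    d = width
layers.append(nn.Linear(d, n_classes))
model = nn.Sequential(*layers)

with torch.no_grad():
    x = torch.randn(n_inputs, in_dim)      # zero-mean, unit-variance inputs
    preds = model(x).argmax(dim=1)         # class guessed for each input
    counts = torch.bincount(preds, minlength=n_classes)

# Under IGB, one class typically absorbs far more than 1/n_classes of the guesses.
print({f"class {c}": f"{counts[c].item() / n_inputs:.2%}" for c in range(n_classes)})

Re-running with different random seeds shows that which class is favored varies from one initialization to another, while the imbalance itself persists, consistent with the paper's observation that the effect depends on architectural choices rather than on any explicit bias term.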

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-francazi24a,
  title     = {Initial Guessing Bias: How Untrained Networks Favor Some Classes},
  author    = {Francazi, Emanuele and Lucchi, Aurelien and Baity-Jesi, Marco},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {13783--13839},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/francazi24a/francazi24a.pdf},
  url       = {https://proceedings.mlr.press/v235/francazi24a.html},
  abstract  = {Understanding and controlling biasing effects in neural networks is crucial for ensuring accurate and fair model performance. In the context of classification problems, we provide a theoretical analysis demonstrating that the structure of a deep neural network (DNN) can condition the model to assign all predictions to the same class, even before the beginning of training, and in the absence of explicit biases. We prove that, besides dataset properties, the presence of this phenomenon, which we call Initial Guessing Bias (IGB), is influenced by model choices including dataset preprocessing methods, and architectural decisions, such as activation functions, max-pooling layers, and network depth. Our analysis of IGB provides information for architecture selection and model initialization. We also highlight theoretical consequences, such as the breakdown of node-permutation symmetry, the violation of self-averaging and the non-trivial effects that depth has on the phenomenon.}
}
Endnote
%0 Conference Paper
%T Initial Guessing Bias: How Untrained Networks Favor Some Classes
%A Emanuele Francazi
%A Aurelien Lucchi
%A Marco Baity-Jesi
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-francazi24a
%I PMLR
%P 13783--13839
%U https://proceedings.mlr.press/v235/francazi24a.html
%V 235
%X Understanding and controlling biasing effects in neural networks is crucial for ensuring accurate and fair model performance. In the context of classification problems, we provide a theoretical analysis demonstrating that the structure of a deep neural network (DNN) can condition the model to assign all predictions to the same class, even before the beginning of training, and in the absence of explicit biases. We prove that, besides dataset properties, the presence of this phenomenon, which we call Initial Guessing Bias (IGB), is influenced by model choices including dataset preprocessing methods, and architectural decisions, such as activation functions, max-pooling layers, and network depth. Our analysis of IGB provides information for architecture selection and model initialization. We also highlight theoretical consequences, such as the breakdown of node-permutation symmetry, the violation of self-averaging and the non-trivial effects that depth has on the phenomenon.
APA
Francazi, E., Lucchi, A. & Baity-Jesi, M. (2024). Initial Guessing Bias: How Untrained Networks Favor Some Classes. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:13783-13839. Available from https://proceedings.mlr.press/v235/francazi24a.html.
