How Does Loss Function Affect Generalization Performance of Deep Learning? Application to Human Age Estimation

Ali Akbari, Muhammad Awais, Manijeh Bashar, Josef Kittler
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:141-151, 2021.

Abstract

Good generalization performance across a wide variety of domains, in the presence of many external and internal factors, is the fundamental goal of any machine learning algorithm. This paper proves theoretically that the choice of loss function matters for the generalization performance of deep learning-based systems. By deriving the generalization error bound for deep neural models trained by stochastic gradient descent, we pinpoint the characteristics of the loss function that are linked to the generalization error and can therefore guide the loss function selection process. In summary, our main message is: choose a stable loss function, generalize better. Focusing on human age estimation from the face, a challenging topic in computer vision, we then propose a novel loss function for this learning problem. We prove theoretically that the proposed loss function achieves stronger stability, and consequently a tighter generalization error bound, than the other common loss functions for this problem. We support our findings theoretically, and demonstrate the merits of the guidance process experimentally, achieving significant improvements.

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-akbari21a,
  title     = {How Does Loss Function Affect Generalization Performance of Deep Learning? Application to Human Age Estimation},
  author    = {Akbari, Ali and Awais, Muhammad and Bashar, Manijeh and Kittler, Josef},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {141--151},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/akbari21a/akbari21a.pdf},
  url       = {https://proceedings.mlr.press/v139/akbari21a.html},
  abstract  = {Good generalization performance across a wide variety of domains caused by many external and internal factors is the fundamental goal of any machine learning algorithm. This paper theoretically proves that the choice of loss function matters for improving the generalization performance of deep learning-based systems. By deriving the generalization error bound for deep neural models trained by stochastic gradient descent, we pinpoint the characteristics of the loss function that is linked to the generalization error and can therefore be used for guiding the loss function selection process. In summary, our main statement in this paper is: choose a stable loss function, generalize better. Focusing on human age estimation from the face which is a challenging topic in computer vision, we then propose a novel loss function for this learning problem. We theoretically prove that the proposed loss function achieves stronger stability, and consequently a tighter generalization error bound, compared to the other common loss functions for this problem. We have supported our findings theoretically, and demonstrated the merits of the guidance process experimentally, achieving significant improvements.}
}
EndNote
%0 Conference Paper
%T How Does Loss Function Affect Generalization Performance of Deep Learning? Application to Human Age Estimation
%A Ali Akbari
%A Muhammad Awais
%A Manijeh Bashar
%A Josef Kittler
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-akbari21a
%I PMLR
%P 141--151
%U https://proceedings.mlr.press/v139/akbari21a.html
%V 139
%X Good generalization performance across a wide variety of domains caused by many external and internal factors is the fundamental goal of any machine learning algorithm. This paper theoretically proves that the choice of loss function matters for improving the generalization performance of deep learning-based systems. By deriving the generalization error bound for deep neural models trained by stochastic gradient descent, we pinpoint the characteristics of the loss function that is linked to the generalization error and can therefore be used for guiding the loss function selection process. In summary, our main statement in this paper is: choose a stable loss function, generalize better. Focusing on human age estimation from the face which is a challenging topic in computer vision, we then propose a novel loss function for this learning problem. We theoretically prove that the proposed loss function achieves stronger stability, and consequently a tighter generalization error bound, compared to the other common loss functions for this problem. We have supported our findings theoretically, and demonstrated the merits of the guidance process experimentally, achieving significant improvements.
APA
Akbari, A., Awais, M., Bashar, M. & Kittler, J. (2021). How Does Loss Function Affect Generalization Performance of Deep Learning? Application to Human Age Estimation. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:141-151. Available from https://proceedings.mlr.press/v139/akbari21a.html.