Being Bayesian about Categorical Probability

Taejong Joo, Uijung Chung, Min-Gwan Seo
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:4950-4961, 2020.

Abstract

Neural networks use the softmax as a building block in classification tasks, but the softmax suffers from overconfidence and lacks a means of representing uncertainty. As a Bayesian alternative to the softmax, we treat the categorical probability over class labels as a random variable. In this framework, the prior distribution explicitly models the presumed noise inherent in the observed labels, which yields consistent gains in generalization performance on multiple challenging tasks. The proposed method inherits the advantages of Bayesian approaches, achieving better uncertainty estimation and model calibration. It can be implemented as a plug-and-play loss function with negligible computational overhead compared to the softmax with the cross-entropy loss.
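The abstract's claim that the method reduces to a plug-and-play loss function can be illustrated with a short sketch. The code below is a minimal, hypothetical PyTorch rendering of the general idea, not the authors' exact formulation: the network's exponentiated logits are read as the concentration parameters of a Dirichlet distribution over the categorical probability, and the loss combines the expected log-likelihood of the observed label with a KL term against a uniform Dirichlet prior. The function name `belief_matching_loss`, the `prior` value, and the `kl_weight` knob are illustrative assumptions.

```python
import torch

def belief_matching_loss(logits, labels, prior=1.0, kl_weight=1.0):
    """Hypothetical Dirichlet-based drop-in for softmax cross-entropy.

    logits: (B, K) raw network outputs; labels: (B,) int64 class indices.
    Exponentiated logits are treated as Dirichlet concentration parameters
    over the categorical probability, and the loss is a negative evidence
    lower bound against a uniform Dirichlet prior with concentration `prior`.
    """
    alphas = torch.exp(logits)            # Dirichlet concentrations, (B, K)
    alpha0 = alphas.sum(dim=-1)           # total concentration per example, (B,)

    # Expected log-likelihood of the observed label under Dir(alphas):
    # E[log z_y] = digamma(alpha_y) - digamma(alpha_0)
    alpha_y = alphas.gather(-1, labels.unsqueeze(-1)).squeeze(-1)
    expected_ll = torch.digamma(alpha_y) - torch.digamma(alpha0)

    # Closed-form KL( Dir(alphas) || Dir(prior, ..., prior) )
    prior_alphas = torch.full_like(alphas, prior)
    prior0 = prior_alphas.sum(dim=-1)
    kl = (torch.lgamma(alpha0) - torch.lgamma(prior0)
          + (torch.lgamma(prior_alphas) - torch.lgamma(alphas)).sum(dim=-1)
          + ((alphas - prior_alphas)
             * (torch.digamma(alphas) - torch.digamma(alpha0).unsqueeze(-1))).sum(dim=-1))

    # Negative ELBO, averaged over the batch (drop-in for cross-entropy loss)
    return (kl_weight * kl - expected_ll).mean()
```

Used in place of a standard cross-entropy call in a training loop, this sketch keeps the same inputs and adds only a few elementwise digamma/lgamma operations, which is consistent with the abstract's claim of negligible computational overhead.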

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-joo20a,
  title     = {Being {B}ayesian about Categorical Probability},
  author    = {Joo, Taejong and Chung, Uijung and Seo, Min-Gwan},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {4950--4961},
  year      = {2020},
  editor    = {III, Hal Daumé and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/joo20a/joo20a.pdf},
  url       = {https://proceedings.mlr.press/v119/joo20a.html},
  abstract  = {Neural networks utilize the softmax as a building block in classification tasks, which contains an overconfidence problem and lacks an uncertainty representation ability. As a Bayesian alternative to the softmax, we consider a random variable of a categorical probability over class labels. In this framework, the prior distribution explicitly models the presumed noise inherent in the observed label, which provides consistent gains in generalization performance in multiple challenging tasks. The proposed method inherits advantages of Bayesian approaches that achieve better uncertainty estimation and model calibration. Our method can be implemented as a plug-and-play loss function with negligible computational overhead compared to the softmax with the cross-entropy loss function.}
}
Endnote
%0 Conference Paper
%T Being Bayesian about Categorical Probability
%A Taejong Joo
%A Uijung Chung
%A Min-Gwan Seo
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-joo20a
%I PMLR
%P 4950--4961
%U https://proceedings.mlr.press/v119/joo20a.html
%V 119
%X Neural networks utilize the softmax as a building block in classification tasks, which contains an overconfidence problem and lacks an uncertainty representation ability. As a Bayesian alternative to the softmax, we consider a random variable of a categorical probability over class labels. In this framework, the prior distribution explicitly models the presumed noise inherent in the observed label, which provides consistent gains in generalization performance in multiple challenging tasks. The proposed method inherits advantages of Bayesian approaches that achieve better uncertainty estimation and model calibration. Our method can be implemented as a plug-and-play loss function with negligible computational overhead compared to the softmax with the cross-entropy loss function.
APA
Joo, T., Chung, U., & Seo, M. (2020). Being Bayesian about Categorical Probability. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:4950-4961. Available from https://proceedings.mlr.press/v119/joo20a.html.