Preventing Arbitrarily High Confidence on Far-Away Data in Point-Estimated Discriminative Neural Networks

Ahmad Rashid, Serena Hacker, Guojun Zhang, Agustinus Kristiadi, Pascal Poupart
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:3034-3042, 2024.

Abstract

Discriminatively trained, deterministic neural networks are the de facto choice for classification problems. However, even though they achieve state-of-the-art results on in-domain test sets, they tend to be overconfident on out-of-distribution (OOD) data. For instance, ReLU networks—a popular class of neural network architectures—have been shown to almost always yield high confidence predictions when the test data are far away from the training set, even when they are trained with OOD data. We overcome this problem by adding a term to the output of the neural network that corresponds to the logit of an extra class, that we design to dominate the logits of the original classes as we move away from the training data. This technique provably prevents arbitrarily high confidence on far-away test data while maintaining a simple discriminative point-estimate training. Evaluation on various benchmarks demonstrates strong performance against competitive baselines on both far-away and realistic OOD data.

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-rashid24a, title = {Preventing Arbitrarily High Confidence on Far-Away Data in Point-Estimated Discriminative Neural Networks}, author = {Rashid, Ahmad and Hacker, Serena and Zhang, Guojun and Kristiadi, Agustinus and Poupart, Pascal}, booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics}, pages = {3034--3042}, year = {2024}, editor = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen}, volume = {238}, series = {Proceedings of Machine Learning Research}, month = {02--04 May}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v238/rashid24a/rashid24a.pdf}, url = {https://proceedings.mlr.press/v238/rashid24a.html}, abstract = {Discriminatively trained, deterministic neural networks are the de facto choice for classification problems. However, even though they achieve state-of-the-art results on in-domain test sets, they tend to be overconfident on out-of-distribution (OOD) data. For instance, ReLU networks—a popular class of neural network architectures—have been shown to almost always yield high confidence predictions when the test data are far away from the training set, even when they are trained with OOD data. We overcome this problem by adding a term to the output of the neural network that corresponds to the logit of an extra class, that we design to dominate the logits of the original classes as we move away from the training data. This technique provably prevents arbitrarily high confidence on far-away test data while maintaining a simple discriminative point-estimate training. Evaluation on various benchmarks demonstrates strong performance against competitive baselines on both far-away and realistic OOD data.} }
Endnote
%0 Conference Paper %T Preventing Arbitrarily High Confidence on Far-Away Data in Point-Estimated Discriminative Neural Networks %A Ahmad Rashid %A Serena Hacker %A Guojun Zhang %A Agustinus Kristiadi %A Pascal Poupart %B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2024 %E Sanjoy Dasgupta %E Stephan Mandt %E Yingzhen Li %F pmlr-v238-rashid24a %I PMLR %P 3034--3042 %U https://proceedings.mlr.press/v238/rashid24a.html %V 238 %X Discriminatively trained, deterministic neural networks are the de facto choice for classification problems. However, even though they achieve state-of-the-art results on in-domain test sets, they tend to be overconfident on out-of-distribution (OOD) data. For instance, ReLU networks—a popular class of neural network architectures—have been shown to almost always yield high confidence predictions when the test data are far away from the training set, even when they are trained with OOD data. We overcome this problem by adding a term to the output of the neural network that corresponds to the logit of an extra class, that we design to dominate the logits of the original classes as we move away from the training data. This technique provably prevents arbitrarily high confidence on far-away test data while maintaining a simple discriminative point-estimate training. Evaluation on various benchmarks demonstrates strong performance against competitive baselines on both far-away and realistic OOD data.
APA
Rashid, A., Hacker, S., Zhang, G., Kristiadi, A. & Poupart, P.. (2024). Preventing Arbitrarily High Confidence on Far-Away Data in Point-Estimated Discriminative Neural Networks. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:3034-3042 Available from https://proceedings.mlr.press/v238/rashid24a.html.

Related Material