SafetyCage: A misclassification detector for feed-forward neural networks

Pål Vegard Johnsen, Filippo Remonato
Proceedings of the 5th Northern Lights Deep Learning Conference (NLDL), PMLR 233:113-119, 2024.

Abstract

Deep learning classifiers have reached state-of-the-art performance in many fields, particularly in image classification. A wrong class assignment is often inconsequential when distinguishing pictures of cats and dogs, but in more critical operations, such as autonomous driving or industrial process control, wrong classifications can lead to disastrous events. While reducing a classifier's error rate is of primary importance, it is impossible to eliminate errors completely. A system able to flag wrong or suspicious classifications is therefore a necessary component for safe and robust operations. In this work, we present a general statistical inference framework for detecting misclassifications. We test our approach on two well-known benchmark datasets: MNIST and CIFAR-10. We show that, provided the underlying classifier is well trained, SafetyCage is effective at flagging wrong classifications. We also include a detailed discussion of the approach's drawbacks and what can be done to improve it.
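
As a rough illustration of the kind of detector the abstract describes — flagging a prediction when the network's internal state looks statistically atypical for the predicted class — the minimal sketch below fits class-conditional Gaussians to hidden-layer activations and tests new samples with a Mahalanobis-distance p-value. The Gaussian model, the ActivationFlagger name, and all parameters here are illustrative assumptions, not the paper's actual formulation.

```python
# Hypothetical sketch (not the authors' method): flag a prediction as
# suspicious when the network's hidden activations are statistically
# atypical for the predicted class.
import numpy as np
from scipy.stats import chi2

class ActivationFlagger:
    def fit(self, feats, labels):
        # Fit one Gaussian per class to training-set activation vectors.
        self.stats = {}
        for c in np.unique(labels):
            x = feats[labels == c]
            mu = x.mean(axis=0)
            # Small ridge term keeps the covariance invertible.
            cov = np.cov(x, rowvar=False) + 1e-6 * np.eye(x.shape[1])
            self.stats[c] = (mu, np.linalg.inv(cov))
        self.dof = feats.shape[1]
        return self

    def p_value(self, feat, predicted_class):
        # Squared Mahalanobis distance to the predicted class; under the
        # Gaussian assumption it is approximately chi-squared distributed
        # with `dim` degrees of freedom.
        mu, prec = self.stats[predicted_class]
        d2 = (feat - mu) @ prec @ (feat - mu)
        return chi2.sf(d2, df=self.dof)

# Usage: flag the classification if the p-value falls below a chosen
# significance level, e.g. alpha = 0.05.
# flagger = ActivationFlagger().fit(train_feats, train_labels)
# suspicious = flagger.p_value(new_feat, predicted_label) < 0.05
```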

Cite this Paper


BibTeX
@InProceedings{pmlr-v233-johnsen24a,
  title     = {SafetyCage: A misclassification detector for feed-forward neural networks},
  author    = {Johnsen, P{\r{a}}l Vegard and Remonato, Filippo},
  booktitle = {Proceedings of the 5th Northern Lights Deep Learning Conference ({NLDL})},
  pages     = {113--119},
  year      = {2024},
  editor    = {Lutchyn, Tetiana and Ramírez Rivera, Adín and Ricaud, Benjamin},
  volume    = {233},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--11 Jan},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v233/johnsen24a/johnsen24a.pdf},
  url       = {https://proceedings.mlr.press/v233/johnsen24a.html}
}
Endnote
%0 Conference Paper
%T SafetyCage: A misclassification detector for feed-forward neural networks
%A Pål Vegard Johnsen
%A Filippo Remonato
%B Proceedings of the 5th Northern Lights Deep Learning Conference (NLDL)
%C Proceedings of Machine Learning Research
%D 2024
%E Tetiana Lutchyn
%E Adín Ramírez Rivera
%E Benjamin Ricaud
%F pmlr-v233-johnsen24a
%I PMLR
%P 113--119
%U https://proceedings.mlr.press/v233/johnsen24a.html
%V 233
APA
Johnsen, P.V. & Remonato, F. (2024). SafetyCage: A misclassification detector for feed-forward neural networks. Proceedings of the 5th Northern Lights Deep Learning Conference (NLDL), in Proceedings of Machine Learning Research 233:113-119. Available from https://proceedings.mlr.press/v233/johnsen24a.html.
