Detecting Whisper Hallucinations with Local Confidence Contrasts

Sam Corpataux, Anna Scius-Bertrand, Beat Wolf
Proceedings of the Fourth Swiss AI Days, PMLR 309:38-45, 2026.

Abstract

Automatic speech recognition has advanced significantly with models like Whisper, yet confident hallucinations remain a critical challenge. In this work, we propose a lightweight and interpretable error detection framework that augments acoustic confidence with explicit contextual features. We introduce the Local Confidence Drop, a novel metric designed to capture sudden stability dips between neighboring tokens. Evaluated on the FLEURS dataset, our random forest classifier achieves 0.64 AP, consistently outperforming the baseline (p < 0.001). Crucially, we demonstrate that hallucinations manifest as local contextual discontinuities, providing a transparent alternative to opaque neural post-processors.

Cite this Paper


BibTeX
@InProceedings{pmlr-v309-corpataux26a,
  title = {Detecting Whisper Hallucinations with Local Confidence Contrasts},
  author = {Corpataux, Sam and Scius-Bertrand, Anna and Wolf, Beat},
  booktitle = {Proceedings of the Fourth Swiss AI Days},
  pages = {38--45},
  year = {2026},
  editor = {Kucharavy, Andrei and Delgado, Pamela and Schürch Todeschini, Valérie and Rumley, Sébastien},
  volume = {309},
  series = {Proceedings of Machine Learning Research},
  month = {23--25 Mar},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v309/main/assets/corpataux26a/corpataux26a.pdf},
  url = {https://proceedings.mlr.press/v309/corpataux26a.html},
  abstract = {Automatic speech recognition has advanced significantly with models like Whisper, yet confident hallucinations remain a critical challenge. In this work, we propose a lightweight and interpretable error detection framework that augments acoustic confidence with explicit contextual features. We introduce the Local Confidence Drop, a novel metric designed to capture sudden stability dips between neighboring tokens. Evaluated on the FLEURS dataset, our random forest classifier achieves 0.64 AP, consistently outperforming the baseline (p < 0.001). Crucially, we demonstrate that hallucinations manifest as local contextual discontinuities, providing a transparent alternative to opaque neural post-processors.}
}
Endnote
%0 Conference Paper
%T Detecting Whisper Hallucinations with Local Confidence Contrasts
%A Sam Corpataux
%A Anna Scius-Bertrand
%A Beat Wolf
%B Proceedings of the Fourth Swiss AI Days
%C Proceedings of Machine Learning Research
%D 2026
%E Andrei Kucharavy
%E Pamela Delgado
%E Valérie Schürch Todeschini
%E Sébastien Rumley
%F pmlr-v309-corpataux26a
%I PMLR
%P 38--45
%U https://proceedings.mlr.press/v309/corpataux26a.html
%V 309
%X Automatic speech recognition has advanced significantly with models like Whisper, yet confident hallucinations remain a critical challenge. In this work, we propose a lightweight and interpretable error detection framework that augments acoustic confidence with explicit contextual features. We introduce the Local Confidence Drop, a novel metric designed to capture sudden stability dips between neighboring tokens. Evaluated on the FLEURS dataset, our random forest classifier achieves 0.64 AP, consistently outperforming the baseline (p < 0.001). Crucially, we demonstrate that hallucinations manifest as local contextual discontinuities, providing a transparent alternative to opaque neural post-processors.
APA
Corpataux, S., Scius-Bertrand, A., & Wolf, B. (2026). Detecting Whisper Hallucinations with Local Confidence Contrasts. Proceedings of the Fourth Swiss AI Days, in Proceedings of Machine Learning Research 309:38-45. Available from https://proceedings.mlr.press/v309/corpataux26a.html.

Related Material