Handling Missing and Unreliable Information in Speech Recognition

Phil D. Green, Jon Barker, Martin Cooke, Ljubomir Josifovski
Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics, PMLR R3:112-116, 2001.

Abstract

In this work, techniques for classification with missing or unreliable data are applied to the problem of noise-robustness in Automatic Speech Recognition (ASR). The primary advantage of this viewpoint is that it makes minimal assumptions about any noise background. As motivation, we review evidence that the auditory system is capable of dealing with incomplete data and, indeed, does so in normal listening conditions. We formulate the unreliable classification problem and show how it can be expressed in the framework of Continuous Density Hidden Markov Models for statistical ASR. We describe experiments on connected digit recognition in noise in which encouraging results are obtained. Results are improved by 'softening' the missing data decision. We argue that if the noise background is unpredictable it is necessary to integrate primitive processes, which identify coherent spectral-temporal regions likely to be dominated by a single source, with a generalised recognition decoder, which searches for the best sub-set of regions that match a speech source. We describe an implementation of a multi-source decoder using missing data recognition and show how it improves recognition results for non-stationary noises.

Cite this Paper


BibTeX
@InProceedings{pmlr-vR3-green01a,
  title     = {Handling Missing and Unreliable Information in Speech Recognition},
  author    = {Green, Phil D. and Barker, Jon and Cooke, Martin and Josifovski, Ljubomir},
  booktitle = {Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics},
  pages     = {112--116},
  year      = {2001},
  editor    = {Richardson, Thomas S. and Jaakkola, Tommi S.},
  volume    = {R3},
  series    = {Proceedings of Machine Learning Research},
  month     = {04--07 Jan},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/r3/green01a/green01a.pdf},
  url       = {https://proceedings.mlr.press/r3/green01a.html},
  abstract  = {In this work, techniques for classification with missing or unreliable data are applied to the problem of noise-robustness in Automatic Speech Recognition (ASR). The primary advantage of this viewpoint is that it makes minimal assumptions about any noise background. As motivation, we review evidence that the auditory system is capable of dealing with incomplete data and, indeed, does so in normal listening conditions. We formulate the unreliable classification problem and show how it can be expressed in the framework of Continuous Density Hidden Markov Models for statistical ASR. We describe experiments on connected digit recognition in noise in which encouraging results are obtained. Results are improved by 'softening' the missing data decision. We argue that if the noise background is unpredictable it is necessary to integrate primitive processes, which identify coherent spectral-temporal regions likely to be dominated by a single source, with a generalised recognition decoder, which searches for the best sub-set of regions that match a speech source. We describe an implementation of a multi-source decoder using missing data recognition and show how it improves recognition results for non-stationary noises.},
  note      = {Reissued by PMLR on 31 March 2021.}
}
Endnote
%0 Conference Paper
%T Handling Missing and Unreliable Information in Speech Recognition
%A Phil D. Green
%A Jon Barker
%A Martin Cooke
%A Ljubomir Josifovski
%B Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2001
%E Thomas S. Richardson
%E Tommi S. Jaakkola
%F pmlr-vR3-green01a
%I PMLR
%P 112--116
%U https://proceedings.mlr.press/r3/green01a.html
%V R3
%X In this work, techniques for classification with missing or unreliable data are applied to the problem of noise-robustness in Automatic Speech Recognition (ASR). The primary advantage of this viewpoint is that it makes minimal assumptions about any noise background. As motivation, we review evidence that the auditory system is capable of dealing with incomplete data and, indeed, does so in normal listening conditions. We formulate the unreliable classification problem and show how it can be expressed in the framework of Continuous Density Hidden Markov Models for statistical ASR. We describe experiments on connected digit recognition in noise in which encouraging results are obtained. Results are improved by 'softening' the missing data decision. We argue that if the noise background is unpredictable it is necessary to integrate primitive processes, which identify coherent spectral-temporal regions likely to be dominated by a single source, with a generalised recognition decoder, which searches for the best sub-set of regions that match a speech source. We describe an implementation of a multi-source decoder using missing data recognition and show how it improves recognition results for non-stationary noises.
%Z Reissued by PMLR on 31 March 2021.
APA
Green, P. D., Barker, J., Cooke, M. & Josifovski, L. (2001). Handling Missing and Unreliable Information in Speech Recognition. Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research R3:112-116. Available from https://proceedings.mlr.press/r3/green01a.html. Reissued by PMLR on 31 March 2021.
