Mai Ho‘omāuna i ka ‘Ai: Language Models Improve Automatic Speech Recognition in Hawaiian

Kaavya D Chaparala; Guido Zarrella; Bruce Torres Fischer; Larry Kimura; Oiwi Parker Jones

Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:576-583, 2024.

Abstract

In this paper we address the challenge of improving Automatic Speech Recognition (ASR) for a low-resource language, Hawaiian, by incorporating large amounts of independent text data into an ASR foundation model, Whisper. To do this, we train an external language model (LM) on ∼1.5M words of Hawaiian text. We then use the LM to rescore Whisper and compute word error rates (WERs) on a manually curated test set of labeled Hawaiian data. As a baseline, we use Whisper without an external LM. Experimental results reveal a small but significant improvement in WER when ASR outputs are rescored with a Hawaiian LM. The results support leveraging all available data in the development of ASR systems for underrepresented languages.
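The abstract does not spell out how the rescoring is done, so the following is only a minimal sketch of one common approach: rerank an n-best list of ASR hypotheses by adding a weighted LM log-probability to each ASR score, then measure WER as word-level edit distance divided by reference length. The names `rescore`, `wer`, `lm_weight`, and `toy_lm`, along with the toy hypotheses and scores, are illustrative assumptions and are not taken from the paper.

```python
from typing import Callable, List, Tuple


def rescore(nbest: List[Tuple[str, float]],
            lm_logprob: Callable[[str], float],
            lm_weight: float = 0.5) -> str:
    """Pick the hypothesis maximizing asr_logprob + lm_weight * lm_logprob.

    `nbest` holds (text, asr_logprob) pairs. The interpolation weight is a
    hypothetical tuning parameter, not a value reported in the paper.
    """
    return max(nbest, key=lambda h: h[1] + lm_weight * lm_logprob(h[0]))[0]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference prefix seen so far
    # and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def toy_lm(text: str) -> float:
    """Stand-in for a real LM: made-up scores favoring the spelling with the kahakō."""
    return -0.5 if "kāua" in text else -2.0


# Toy n-best list with made-up ASR log-probabilities.
nbest = [("aloha kaua", -1.0), ("aloha kāua", -1.2)]
baseline = rescore(nbest, toy_lm, lm_weight=0.0)   # ASR score only
rescored = rescore(nbest, toy_lm, lm_weight=0.5)   # ASR + LM
```

With `lm_weight=0.0` the function reduces to the no-LM baseline, so the two calls mirror the baseline versus rescored comparison described in the abstract; in practice the weight would be tuned on held-out data.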

Cite this Paper

BibTeX

@InProceedings{pmlr-v262-d-chaparala24a,
  title = 	 {{Mai Ho‘omāuna i ka ‘Ai}: Language Models Improve Automatic Speech Recognition in Hawaiian},
  author =       {D Chaparala, Kaavya and Zarrella, Guido and Torres Fischer, Bruce and Kimura, Larry and Parker Jones, Oiwi},
  booktitle = 	 {Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop},
  pages = 	 {576--583},
  year = 	 {2024},
  editor = 	 {Rezagholizadeh, Mehdi and Passban, Peyman and Samiee, Soheila and Partovi Nia, Vahid and Cheng, Yu and Deng, Yue and Liu, Qun and Chen, Boxing},
  volume = 	 {262},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {14 Dec},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v262/main/assets/d-chaparala24a/d-chaparala24a.pdf},
  url = 	 {https://proceedings.mlr.press/v262/d-chaparala24a.html},
  abstract = 	 {In this paper we address the challenge of improving Automatic Speech Recognition (ASR) for a low-resource language, Hawaiian, by incorporating large amounts of independent text data into an ASR foundation model, Whisper. To do this, we train an external language model (LM) on ∼1.5M words of Hawaiian text. We then use the LM to rescore Whisper and compute word error rates (WERs) on a manually curated test set of labeled Hawaiian data. As a baseline, we use Whisper without an external LM. Experimental results reveal a small but significant improvement in WER when ASR outputs are rescored with a Hawaiian LM. The results support leveraging all available data in the development of ASR systems for underrepresented languages.}
}

Endnote

%0 Conference Paper
%T Mai Ho‘omāuna i ka ‘Ai: Language Models Improve Automatic Speech Recognition in Hawaiian
%A Kaavya D Chaparala
%A Guido Zarrella
%A Bruce Torres Fischer
%A Larry Kimura
%A Oiwi Parker Jones
%B Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop
%C Proceedings of Machine Learning Research
%D 2024
%E Mehdi Rezagholizadeh
%E Peyman Passban
%E Soheila Samiee
%E Vahid Partovi Nia
%E Yu Cheng
%E Yue Deng
%E Qun Liu
%E Boxing Chen
%F pmlr-v262-d-chaparala24a
%I PMLR
%P 576--583
%U https://proceedings.mlr.press/v262/d-chaparala24a.html
%V 262
%X In this paper we address the challenge of improving Automatic Speech Recognition (ASR) for a low-resource language, Hawaiian, by incorporating large amounts of independent text data into an ASR foundation model, Whisper. To do this, we train an external language model (LM) on ∼1.5M words of Hawaiian text. We then use the LM to rescore Whisper and compute word error rates (WERs) on a manually curated test set of labeled Hawaiian data. As a baseline, we use Whisper without an external LM. Experimental results reveal a small but significant improvement in WER when ASR outputs are rescored with a Hawaiian LM. The results support leveraging all available data in the development of ASR systems for underrepresented languages.

APA

D Chaparala, K., Zarrella, G., Torres Fischer, B., Kimura, L. & Parker Jones, O. (2024). Mai Ho‘omāuna i ka ‘Ai: Language Models Improve Automatic Speech Recognition in Hawaiian. Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, in Proceedings of Machine Learning Research 262:576-583. Available from https://proceedings.mlr.press/v262/d-chaparala24a.html.
