Closing the Gap in Low-Resource ASR: Leveraging Multilingual Models for Code-Switched Yoruba-English Speech

Emmanuel Bolarinwa, Oreoluwa Babatunde, Victor Olufemi, Kausar Moshood, Oluwademilade Williams
DLI 2025 Research Track, PMLR 302:1-9, 2026.

Abstract

Recent advancements in Automatic Speech Recognition (ASR) have revolutionized voice-based technologies, yet challenges persist in achieving accurate recognition for multilingual and low-resource languages. This research explores the performance of state-of-the-art multilingual ASR models (Whisper Large v3 and MMS-1B-All) on Yoruba-English code-switched (CS) speech. Despite notable progress in multilingual ASR, code-switching remains a complex challenge due to the linguistic intricacies introduced by phonetic, syntactic, and lexical shifts within single utterances. This study addresses a significant gap in the literature by evaluating these models on a 21-hour Yoruba-English CS dataset and fine-tuning them for domain-specific performance. Results show that fine-tuning led to substantial improvements in Word Error Rate (WER), with MMS-1B-All achieving a 55.8% reduction and Whisper Large v3 a 50.1% reduction. Although MMS-1B-All slightly outperformed Whisper Large v3, both models demonstrated strong potential for Yoruba-English CS speech recognition. This study highlights the feasibility of fine-tuning multilingual ASR models for low-resource code-switched scenarios and suggests directions for future research, including dataset expansion, alternative fine-tuning strategies, and real-time performance evaluation.

Keywords: automatic speech recognition, code-switching, multilingual ASR, low-resource languages
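
To make the headline numbers concrete, here is a minimal sketch of how WER and a relative WER reduction are typically computed. This is an illustrative reconstruction, not the authors' code: the jiwer package, the public Hugging Face checkpoint IDs (openai/whisper-large-v3 and facebook/mms-1b-all), and the example transcripts and numbers are all assumptions.

# Illustrative sketch, not from the paper: scoring a code-switched
# hypothesis against its reference and computing a relative WER reduction.
from jiwer import wer  # pip install jiwer

# Hypothetical Yoruba-English code-switched utterance and a model output
# that drops the tone diacritics (counted as substitutions by WER).
reference = "mo fẹ́ ra data bundle lọ́la"
hypothesis = "mo fe ra data bundle lola"
print(f"utterance WER: {wer(reference, hypothesis):.2%}")  # 2 errors / 6 words

def relative_reduction(baseline_wer: float, finetuned_wer: float) -> float:
    """Fractional drop in WER after fine-tuning, as quoted in the abstract."""
    return (baseline_wer - finetuned_wer) / baseline_wer

# Illustrative values only; the paper reports 55.8% (MMS-1B-All) and
# 50.1% (Whisper Large v3).
print(f"relative reduction: {relative_reduction(0.60, 0.30):.1%}")

# Obtaining hypotheses from the two evaluated checkpoints (assumed public
# Hugging Face model IDs; the paper's exact decoding setup may differ):
# from transformers import pipeline
# asr = pipeline("automatic-speech-recognition", model="openai/whisper-large-v3")
# hypothesis = asr("clip.wav")["text"]

Read this way, a 55.8% reduction means the fine-tuned WER is roughly 44% of the baseline value; the absolute WER figures are given in the paper itself.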

Cite this Paper


BibTeX
@InProceedings{pmlr-v302-bolarinwa26a,
  title     = {Closing the Gap in Low-Resource ASR: Leveraging Multilingual Models for Code-Switched Yoruba-English Speech},
  author    = {Bolarinwa, Emmanuel and Babatunde, Oreoluwa and Olufemi, Victor and Moshood, Kausar and Williams, Oluwademilade},
  booktitle = {DLI 2025 Research Track},
  pages     = {1--9},
  year      = {2026},
  editor    = {Haddad, Hatem and Kahira, Albert Njoroge and Bourhim, Sofia and Olatunji, Iyiola Emmanuel and Makhafola, Lesego and Mwase, Christine},
  volume    = {302},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--22 Aug},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v302/main/assets/bolarinwa26a/bolarinwa26a.pdf},
  url       = {https://proceedings.mlr.press/v302/bolarinwa26a.html},
  abstract  = {Recent advancements in Automatic Speech Recognition (ASR) have revolutionized voice-based technologies, yet challenges persist in achieving accurate recognition for multilingual and low-resource languages. This research explores the performance of state-of-the-art multilingual ASR models (Whisper Large v3 and MMS-1B-All) on Yoruba-English code-switched (CS) speech. Despite notable progress in multilingual ASR, code-switching remains a complex challenge due to the linguistic intricacies introduced by phonetic, syntactic, and lexical shifts within single utterances. This study addresses a significant gap in the literature by evaluating these models on a 21-hour Yoruba-English CS dataset and fine-tuning them for domain-specific performance. Results show that fine-tuning led to substantial improvements in Word Error Rate (WER), with MMS-1B-All achieving a 55.8% reduction and Whisper Large v3 a 50.1% reduction. Although MMS-1B-All slightly outperformed Whisper Large v3, both models demonstrated strong potential for Yoruba-English CS speech recognition. This study highlights the feasibility of fine-tuning multilingual ASR models for low-resource code-switched scenarios and suggests directions for future research, including dataset expansion, alternative fine-tuning strategies, and real-time performance evaluation. Keywords: automatic speech recognition, code-switching, multilingual ASR, low-resource languages}
}
Endnote
%0 Conference Paper
%T Closing the Gap in Low-Resource ASR: Leveraging Multilingual Models for Code-Switched Yoruba-English Speech
%A Emmanuel Bolarinwa
%A Oreoluwa Babatunde
%A Victor Olufemi
%A Kausar Moshood
%A Oluwademilade Williams
%B DLI 2025 Research Track
%C Proceedings of Machine Learning Research
%D 2026
%E Hatem Haddad
%E Albert Njoroge Kahira
%E Sofia Bourhim
%E Iyiola Emmanuel Olatunji
%E Lesego Makhafola
%E Christine Mwase
%F pmlr-v302-bolarinwa26a
%I PMLR
%P 1--9
%U https://proceedings.mlr.press/v302/bolarinwa26a.html
%V 302
%X Recent advancements in Automatic Speech Recognition (ASR) have revolutionized voice-based technologies, yet challenges persist in achieving accurate recognition for multilingual and low-resource languages. This research explores the performance of state-of-the-art multilingual ASR models (Whisper Large v3 and MMS-1B-All) on Yoruba-English code-switched (CS) speech. Despite notable progress in multilingual ASR, code-switching remains a complex challenge due to the linguistic intricacies introduced by phonetic, syntactic, and lexical shifts within single utterances. This study addresses a significant gap in the literature by evaluating these models on a 21-hour Yoruba-English CS dataset and fine-tuning them for domain-specific performance. Results show that fine-tuning led to substantial improvements in Word Error Rate (WER), with MMS-1B-All achieving a 55.8% reduction and Whisper Large v3 a 50.1% reduction. Although MMS-1B-All slightly outperformed Whisper Large v3, both models demonstrated strong potential for Yoruba-English CS speech recognition. This study highlights the feasibility of fine-tuning multilingual ASR models for low-resource code-switched scenarios and suggests directions for future research, including dataset expansion, alternative fine-tuning strategies, and real-time performance evaluation. Keywords: automatic speech recognition, code-switching, multilingual ASR, low-resource languages
APA
Bolarinwa, E., Babatunde, O., Olufemi, V., Moshood, K. & Williams, O. (2026). Closing the Gap in Low-Resource ASR: Leveraging Multilingual Models for Code-Switched Yoruba-English Speech. DLI 2025 Research Track, in Proceedings of Machine Learning Research 302:1-9. Available from https://proceedings.mlr.press/v302/bolarinwa26a.html.