Findings of the Second AmericasNLP Competition on Speech-to-Text Translation

Abteen Ebrahimi, Manuel Mager, Adam Wiemerslage, Pavel Denisov, Arturo Oncevay, Danni Liu, Sai Koneru, Enes Yavuz Ugan, Zhaolin Li, Jan Niehues, Monica Romero, Ivan G Torre, Tanel Alumäe, Jiaming Kong, Sergey Polezhaev, Yury Belousov, Wei-Rui Chen, Peter Sullivan, Ife Adebara, Bashar Talafha, Alcides Alcoba Inciarte, Muhammad Abdul-Mageed, Luis Chiruzzo, Rolando Coto-Solano, Hilaria Cruz, Sofía Flores-Solórzano, Aldo Andrés Alvarez López, Ivan Meza-Ruiz, John E. Ortega, Alexis Palmer, Rodolfo Joel Zevallos Salazar, Kristine Stenzel, Thang Vu, Katharina Kann
Proceedings of the NeurIPS 2022 Competitions Track, PMLR 220:217-232, 2022.

Abstract

Indigenous languages, including those from the Americas, have received very little attention from the machine learning (ML) and natural language processing (NLP) communities. To tackle the resulting lack of systems for these languages and the accompanying social inequalities affecting their speakers, we conduct the second AmericasNLP competition (and the first one in collaboration with NeurIPS), which is centered around speech-to-text translation systems for Indigenous languages of the Americas. The competition features three tasks – (1) automatic speech recognition, (2) text-based machine translation, and (3) speech-to-text translation – and two tracks: constrained and unconstrained. Five Indigenous languages are covered: Bribri, Guarani, Kotiria, Wa’ikhana, and Quechua. In this overview paper, we describe the tasks, tracks, and languages, introduce the baseline and participating systems, and end with a summary of ongoing and future challenges for the automatic translation of Indigenous languages.

Cite this Paper


BibTeX
@InProceedings{pmlr-v220-ebrahimi23a,
  title = {Findings of the Second AmericasNLP Competition on Speech-to-Text Translation},
  author = {Ebrahimi, Abteen and Mager, Manuel and Wiemerslage, Adam and Denisov, Pavel and Oncevay, Arturo and Liu, Danni and Koneru, Sai and Ugan, Enes Yavuz and Li, Zhaolin and Niehues, Jan and Romero, Monica and Torre, Ivan G and Alum\"{a}e, Tanel and Kong, Jiaming and Polezhaev, Sergey and Belousov, Yury and Chen, Wei-Rui and Sullivan, Peter and Adebara, Ife and Talafha, Bashar and Inciarte, Alcides Alcoba and Abdul-Mageed, Muhammad and Chiruzzo, Luis and Coto-Solano, Rolando and Cruz, Hilaria and Flores-Sol\'{o}rzano, Sof\'{i}a and L\'{o}pez, Aldo Andr\'{e}s Alvarez and Meza-Ruiz, Ivan and Ortega, John E. and Palmer, Alexis and Salazar, Rodolfo Joel Zevallos and Stenzel, Kristine and Vu, Thang and Kann, Katharina},
  booktitle = {Proceedings of the NeurIPS 2022 Competitions Track},
  pages = {217--232},
  year = {2022},
  editor = {Ciccone, Marco and Stolovitzky, Gustavo and Albrecht, Jacob},
  volume = {220},
  series = {Proceedings of Machine Learning Research},
  month = {28 Nov--09 Dec},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v220/ebrahimi23a/ebrahimi23a.pdf},
  url = {https://proceedings.mlr.press/v220/ebrahimi23a.html},
  abstract = {Indigenous languages, including those from the Americas, have received very little attention from the machine learning (ML) and natural language processing (NLP) communities. To tackle the resulting lack of systems for these languages and the accompanying social inequalities affecting their speakers, we conduct the second AmericasNLP competition (and the first one in collaboration with NeurIPS), which is centered around speech-to-text translation systems for Indigenous languages of the Americas. The competition features three tasks – (1) automatic speech recognition, (2) text-based machine translation, and (3) speech-to-text translation – and two tracks: constrained and unconstrained. Five Indigenous languages are covered: Bribri, Guarani, Kotiria, Wa’ikhana, and Quechua. In this overview paper, we describe the tasks, tracks, and languages, introduce the baseline and participating systems, and end with a summary of ongoing and future challenges for the automatic translation of Indigenous languages.}
}
Endnote
%0 Conference Paper
%T Findings of the Second AmericasNLP Competition on Speech-to-Text Translation
%A Abteen Ebrahimi
%A Manuel Mager
%A Adam Wiemerslage
%A Pavel Denisov
%A Arturo Oncevay
%A Danni Liu
%A Sai Koneru
%A Enes Yavuz Ugan
%A Zhaolin Li
%A Jan Niehues
%A Monica Romero
%A Ivan G Torre
%A Tanel Alumäe
%A Jiaming Kong
%A Sergey Polezhaev
%A Yury Belousov
%A Wei-Rui Chen
%A Peter Sullivan
%A Ife Adebara
%A Bashar Talafha
%A Alcides Alcoba Inciarte
%A Muhammad Abdul-Mageed
%A Luis Chiruzzo
%A Rolando Coto-Solano
%A Hilaria Cruz
%A Sofía Flores-Solórzano
%A Aldo Andrés Alvarez López
%A Ivan Meza-Ruiz
%A John E. Ortega
%A Alexis Palmer
%A Rodolfo Joel Zevallos Salazar
%A Kristine Stenzel
%A Thang Vu
%A Katharina Kann
%B Proceedings of the NeurIPS 2022 Competitions Track
%C Proceedings of Machine Learning Research
%D 2022
%E Marco Ciccone
%E Gustavo Stolovitzky
%E Jacob Albrecht
%F pmlr-v220-ebrahimi23a
%I PMLR
%P 217--232
%U https://proceedings.mlr.press/v220/ebrahimi23a.html
%V 220
%X Indigenous languages, including those from the Americas, have received very little attention from the machine learning (ML) and natural language processing (NLP) communities. To tackle the resulting lack of systems for these languages and the accompanying social inequalities affecting their speakers, we conduct the second AmericasNLP competition (and the first one in collaboration with NeurIPS), which is centered around speech-to-text translation systems for Indigenous languages of the Americas. The competition features three tasks – (1) automatic speech recognition, (2) text-based machine translation, and (3) speech-to-text translation – and two tracks: constrained and unconstrained. Five Indigenous languages are covered: Bribri, Guarani, Kotiria, Wa’ikhana, and Quechua. In this overview paper, we describe the tasks, tracks, and languages, introduce the baseline and participating systems, and end with a summary of ongoing and future challenges for the automatic translation of Indigenous languages.
APA
Ebrahimi, A., Mager, M., Wiemerslage, A., Denisov, P., Oncevay, A., Liu, D., Koneru, S., Ugan, E.Y., Li, Z., Niehues, J., Romero, M., Torre, I.G., Alumäe, T., Kong, J., Polezhaev, S., Belousov, Y., Chen, W., Sullivan, P., Adebara, I., Talafha, B., Inciarte, A.A., Abdul-Mageed, M., Chiruzzo, L., Coto-Solano, R., Cruz, H., Flores-Solórzano, S., López, A.A.A., Meza-Ruiz, I., Ortega, J.E., Palmer, A., Salazar, R.J.Z., Stenzel, K., Vu, T. & Kann, K. (2022). Findings of the Second AmericasNLP Competition on Speech-to-Text Translation. Proceedings of the NeurIPS 2022 Competitions Track, in Proceedings of Machine Learning Research 220:217-232. Available from https://proceedings.mlr.press/v220/ebrahimi23a.html.
