Bridging the Language Gap: Fine-Tuning Llama for Machine Translation in Low-Resource African Languages

Mary Wambui Kariuki, Joseph Muguro, Ciira wa Maina, Lilian Diana Awuor Wanzare
Proceedings of the AI for African Languages Conference 2025, PMLR 314:37-40, 2026.

Abstract

We adapt a pretrained large language model to support Kikuyu, a low-resource African language. A dataset of 140,000 aligned English-Swahili-Kikuyu sentences was collected across multiple domains, with a 30,000-sentence English-Kikuyu subset used for training. After preprocessing and normalization, the Llama 3.2 (3B) model was fine-tuned using parameter-efficient techniques. The resulting system achieves a BLEU score of 25.21, demonstrating the effectiveness of transfer learning for low-resource machine translation.
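
The abstract names the method only at a high level. As a purely illustrative sketch (not the authors' code), parameter-efficient fine-tuning of Llama 3.2 (3B) for English-to-Kikuyu translation could be done with LoRA adapters via the Hugging Face peft library; the prompt template, data file name, LoRA rank, and training hyperparameters below are assumptions, not details taken from the paper.

# Hypothetical sketch: LoRA fine-tuning of Llama 3.2 (3B) for English -> Kikuyu.
# Prompt format, file name, and hyperparameters are illustrative assumptions.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "meta-llama/Llama-3.2-3B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# LoRA trains small low-rank adapters on the attention projections while the
# base weights stay frozen, which is what makes the method parameter-efficient.
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of all weights

# Assumed JSONL file of sentence pairs: {"en": "...", "kik": "..."}.
data = load_dataset("json", data_files="en_kik_train.jsonl", split="train")

def to_example(row):
    # Format each pair as an instruction-style prompt ending with EOS.
    text = ("Translate English to Kikuyu.\n"
            f"English: {row['en']}\nKikuyu: {row['kik']}{tokenizer.eos_token}")
    return tokenizer(text, truncation=True, max_length=256)

tokenized = data.map(to_example, remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama32-en-kik-lora",
                           per_device_train_batch_size=4,
                           gradient_accumulation_steps=8,
                           num_train_epochs=3,
                           learning_rate=2e-4,
                           bf16=True,
                           logging_steps=50),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

A corpus-level BLEU score such as the reported 25.21 would then be computed on a held-out test set with a standard scorer, e.g. sacrebleu.corpus_bleu(hypotheses, [references]).score.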

Cite this Paper

BibTeX
@InProceedings{pmlr-v314-kariuki26a,
  title     = {Bridging the Language Gap: Fine-Tuning Llama for Machine Translation in Low-Resource African Languages},
  author    = {Kariuki, Mary Wambui and Muguro, Joseph and Maina, Ciira wa and Wanzare, Lilian Diana Awuor},
  booktitle = {Proceedings of the AI for African Languages Conference 2025},
  pages     = {37--40},
  year      = {2026},
  editor    = {Bainomugisha, Engineer and Mwebaze, Ernest and Kimera, Richard and Nabende, Joyce Nakatumba and Katumba, Andrew and Quinn, John},
  volume    = {314},
  series    = {Proceedings of Machine Learning Research},
  month     = {10 Oct},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v314/main/assets/kariuki26a/kariuki26a.pdf},
  url       = {https://proceedings.mlr.press/v314/kariuki26a.html},
  abstract  = {We adapt a pretrained large language model to support Kikuyu, a low-resource African language. A dataset of 140,000 aligned English-Swahili-Kikuyu sentences was collected across multiple domains, with a 30,000-sentence English-Kikuyu subset used for training. After preprocessing and normalization, the Llama 3.2 (3B) model was fine-tuned using parameter-efficient techniques. The resulting system achieves a BLEU score of 25.21, demonstrating the effectiveness of transfer learning for low-resource machine translation.}
}
Endnote
%0 Conference Paper
%T Bridging the Language Gap: Fine-Tuning Llama for Machine Translation in Low-Resource African Languages
%A Mary Wambui Kariuki
%A Joseph Muguro
%A Ciira wa Maina
%A Lilian Diana Awuor Wanzare
%B Proceedings of the AI for African Languages Conference 2025
%C Proceedings of Machine Learning Research
%D 2026
%E Engineer Bainomugisha
%E Ernest Mwebaze
%E Richard Kimera
%E Joyce Nakatumba Nabende
%E Andrew Katumba
%E John Quinn
%F pmlr-v314-kariuki26a
%I PMLR
%P 37--40
%U https://proceedings.mlr.press/v314/kariuki26a.html
%V 314
%X We adapt a pretrained large language model to support Kikuyu, a low-resource African language. A dataset of 140,000 aligned English-Swahili-Kikuyu sentences was collected across multiple domains, with a 30,000-sentence English-Kikuyu subset used for training. After preprocessing and normalization, the Llama 3.2 (3B) model was fine-tuned using parameter-efficient techniques. The resulting system achieves a BLEU score of 25.21, demonstrating the effectiveness of transfer learning for low-resource machine translation.
APA
Kariuki, M.W., Muguro, J., Maina, C.w. & Wanzare, L.D.A. (2026). Bridging the Language Gap: Fine-Tuning Llama for Machine Translation in Low-Resource African Languages. Proceedings of the AI for African Languages Conference 2025, in Proceedings of Machine Learning Research 314:37-40. Available from https://proceedings.mlr.press/v314/kariuki26a.html.
