Bridging the Language Gap: Fine-Tuning Llama for Machine Translation in Low-Resource African Languages

Mary Wambui Kariuki, Joseph Muguro, Ciira wa Maina, Lilian Diana Awuor Wanzare
Proceedings of the AI for African Languages Conference 2025, PMLR 314:37-40, 2026.

Abstract

We adapt a pretrained large language model to support Kikuyu, a low-resource African language. A dataset of 140,000 aligned English-Swahili-Kikuyu sentences was collected across multiple domains, with a 30,000-sentence English-Kikuyu subset used for training. After preprocessing and normalization, the Llama 3.2 (3B) model was fine-tuned using parameter-efficient techniques. The resulting system achieves a BLEU score of 25.21, demonstrating the effectiveness of transfer learning for low-resource machine translation.
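
The abstract names the method only at a high level. As a purely illustrative sketch (not the authors' code), parameter-efficient fine-tuning of Llama 3.2 (3B) for English-to-Kikuyu translation could be done with LoRA adapters via the Hugging Face peft library; the prompt template, data file name, LoRA rank, and training hyperparameters below are assumptions, not details taken from the paper.

# Hypothetical sketch: LoRA fine-tuning of Llama 3.2 (3B) for English -> Kikuyu.
# Prompt format, file name, and hyperparameters are illustrative assumptions.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "meta-llama/Llama-3.2-3B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# LoRA trains small low-rank adapters on the attention projections while the
# base weights stay frozen, which is what makes the method parameter-efficient.
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of all weights

# Assumed JSONL file of sentence pairs: {"en": "...", "kik": "..."}.
data = load_dataset("json", data_files="en_kik_train.jsonl", split="train")

def to_example(row):
    # Format each pair as an instruction-style prompt ending with EOS.
    text = ("Translate English to Kikuyu.\n"
            f"English: {row['en']}\nKikuyu: {row['kik']}{tokenizer.eos_token}")
    return tokenizer(text, truncation=True, max_length=256)

tokenized = data.map(to_example, remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama32-en-kik-lora",
                           per_device_train_batch_size=4,
                           gradient_accumulation_steps=8,
                           num_train_epochs=3,
                           learning_rate=2e-4,
                           bf16=True,
                           logging_steps=50),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

A corpus-level BLEU score such as the reported 25.21 would then be computed on a held-out test set with a standard scorer, e.g. sacrebleu.corpus_bleu(hypotheses, [references]).score.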

Cite this Paper

BibTeX
@InProceedings{pmlr-v314-kariuki26a,
  title     = {Bridging the Language Gap: Fine-Tuning Llama for Machine Translation in Low-Resource African Languages},
  author    = {Kariuki, Mary Wambui and Muguro, Joseph and Maina, Ciira wa and Wanzare, Lilian Diana Awuor},
  booktitle = {Proceedings of the AI for African Languages Conference 2025},
  pages     = {37--40},
  year      = {2026},
  editor    = {Bainomugisha, Engineer and Mwebaze, Ernest and Kimera, Richard and Nabende, Joyce Nakatumba and Katumba, Andrew and Quinn, John},
  volume    = {314},
  series    = {Proceedings of Machine Learning Research},
  month     = {10 Oct},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v314/main/assets/kariuki26a/kariuki26a.pdf},
  url       = {https://proceedings.mlr.press/v314/kariuki26a.html},
  abstract  = {We adapt a pretrained large language model to support Kikuyu, a low-resource African language. A dataset of 140,000 aligned English-Swahili-Kikuyu sentences was collected across multiple domains, with a 30,000-sentence English-Kikuyu subset used for training. After preprocessing and normalization, the Llama 3.2 (3B) model was fine-tuned using parameter-efficient techniques. The resulting system achieves a BLEU score of 25.21, demonstrating the effectiveness of transfer learning for low-resource machine translation.}
}
Endnote
%0 Conference Paper
%T Bridging the Language Gap: Fine-Tuning Llama for Machine Translation in Low-Resource African Languages
%A Mary Wambui Kariuki
%A Joseph Muguro
%A Ciira wa Maina
%A Lilian Diana Awuor Wanzare
%B Proceedings of the AI for African Languages Conference 2025
%C Proceedings of Machine Learning Research
%D 2026
%E Engineer Bainomugisha
%E Ernest Mwebaze
%E Richard Kimera
%E Joyce Nakatumba Nabende
%E Andrew Katumba
%E John Quinn
%F pmlr-v314-kariuki26a
%I PMLR
%P 37--40
%U https://proceedings.mlr.press/v314/kariuki26a.html
%V 314
%X We adapt a pretrained large language model to support Kikuyu, a low-resource African language. A dataset of 140,000 aligned English-Swahili-Kikuyu sentences was collected across multiple domains, with a 30,000-sentence English-Kikuyu subset used for training. After preprocessing and normalization, the Llama 3.2 (3B) model was fine-tuned using parameter-efficient techniques. The resulting system achieves a BLEU score of 25.21, demonstrating the effectiveness of transfer learning for low-resource machine translation.
APA
Kariuki, M.W., Muguro, J., Maina, C.w. & Wanzare, L.D.A. (2026). Bridging the Language Gap: Fine-Tuning Llama for Machine Translation in Low-Resource African Languages. Proceedings of the AI for African Languages Conference 2025, in Proceedings of Machine Learning Research 314:37-40. Available from https://proceedings.mlr.press/v314/kariuki26a.html.
