Zero-Shot LLM Generation of Energy Notifications for African Languages: A Benchmark Study

Hatem Haddad, Feres Jerbi, Issam Smaali
DLI 2025 Research Track, PMLR 302:1-10, 2026.

Abstract

Large Language Models (LLMs) have demonstrated significant advances in natural language applications but often exhibit performance disparities for low-resource languages, particularly African languages that are underrepresented in training corpora. This paper addresses this gap by evaluating the zero-shot text generation capabilities of LLMs in the energy domain for six widely spoken African languages. We introduce a novel multilingual benchmark dataset of energy management notifications and use it to assess four recent open-source LLMs (1B-7B parameters). Using a zero-shot approach with multiple prompts and established NLP metrics (statistics-based, model-based, and perplexity), without any fine-tuning, we find that model strengths vary across languages and metrics. For instance, while some models excel in content overlap (ROUGE) for languages such as English and French, others show better fluency (perplexity) or semantic similarity (BERTScore), and performance shifts notably for African languages.
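As a rough illustration of the three metric families named in the abstract, the minimal Python sketch below scores a hypothetical candidate notification against a reference. The notification strings are invented (not items from the paper's benchmark), GPT-2 merely stands in for the evaluated 1B-7B models, and the `rouge-score`, `bert-score`, and `transformers` packages are assumed to be installed.

```python
# Minimal sketch of the three metric families: statistics-based (ROUGE),
# model-based (BERTScore), and fluency (perplexity under a causal LM).
import math
import torch
from rouge_score import rouge_scorer            # pip install rouge-score
from bert_score import score as bert_score      # pip install bert-score
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical energy-notification pair (not from the paper's dataset).
reference = "Your solar battery is fully charged; consider running appliances now."
candidate = "The solar battery is full, so it is a good time to use appliances."

# 1) Statistics-based: ROUGE n-gram overlap between candidate and reference.
rouge = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
print(rouge.score(reference, candidate))

# 2) Model-based: BERTScore semantic similarity; `lang` selects a default
#    backbone (a multilingual one would be needed for the African languages).
P, R, F1 = bert_score([candidate], [reference], lang="en")
print(f"BERTScore F1: {F1.item():.3f}")

# 3) Fluency: perplexity of the candidate under a causal LM
#    (GPT-2 is an illustrative stand-in, not one of the paper's models).
tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")
ids = tok(candidate, return_tensors="pt").input_ids
with torch.no_grad():
    loss = lm(ids, labels=ids).loss             # mean token negative log-likelihood
print(f"Perplexity: {math.exp(loss.item()):.1f}")
```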

Cite this Paper


BibTeX
@InProceedings{pmlr-v302-haddad26a,
  title     = {Zero-Shot LLM Generation of Energy Notifications for African Languages: A Benchmark Study},
  author    = {Haddad, Hatem and Jerbi, Feres and Smaali, Issam},
  booktitle = {DLI 2025 Research Track},
  pages     = {1--10},
  year      = {2026},
  editor    = {Haddad, Hatem and Kahira, Albert Njoroge and Bourhim, Sofia and Olatunji, Iyiola Emmanuel and Makhafola, Lesego and Mwase, Christine},
  volume    = {302},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--22 Aug},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v302/main/assets/haddad26a/haddad26a.pdf},
  url       = {https://proceedings.mlr.press/v302/haddad26a.html}
}
Endnote
%0 Conference Paper
%T Zero-Shot LLM Generation of Energy Notifications for African Languages: A Benchmark Study
%A Hatem Haddad
%A Feres Jerbi
%A Issam Smaali
%B DLI 2025 Research Track
%C Proceedings of Machine Learning Research
%D 2026
%E Hatem Haddad
%E Albert Njoroge Kahira
%E Sofia Bourhim
%E Iyiola Emmanuel Olatunji
%E Lesego Makhafola
%E Christine Mwase
%F pmlr-v302-haddad26a
%I PMLR
%P 1--10
%U https://proceedings.mlr.press/v302/haddad26a.html
%V 302
APA
Haddad, H., Jerbi, F. & Smaali, I. (2026). Zero-Shot LLM Generation of Energy Notifications for African Languages: A Benchmark Study. DLI 2025 Research Track, in Proceedings of Machine Learning Research 302:1-10. Available from https://proceedings.mlr.press/v302/haddad26a.html.