Tonative: Community-Driven Extension of African Datasets Through Human-AI Collaboration
Proceedings of the AI for African Languages Conference 2025, PMLR 314:33-36, 2026.
Abstract
Sustainable creation of language resources for African languages remains a major challenge, leaving many languages severely low-resource. While community-driven approaches are effective, they are difficult to scale, and purely synthetic data risks introducing translation artifacts and bias. This paper presents Tonative, a human-AI collaborative framework that extends existing datasets by translating them into additional African languages. The approach combines automated translation with community-based validation, reducing human workload while preserving linguistic authenticity. The proposed framework supports more sustainable and scalable development of African language resources.