Africa is home to more than a quarter of the world's languages, yet many of them are absent from the development of artificial intelligence (AI). This gap is largely due to insufficient investment and a lack of accessible data.
Current AI tools are built predominantly on English, a few other European languages, and Chinese, all of which have extensive text data available online. In contrast, most African languages are primarily oral, so the written content needed for effective AI training is scarce.
This linguistic disparity leaves millions of people at a disadvantage, unable to benefit from technological advances. However, researchers have taken a significant step toward closing the gap by releasing the largest known dataset of African languages.
Prof Vukosi Marivate of the University of Pretoria explains, "If technology doesn't reflect our languages, a whole group risks being left behind." The African Next Voices project, funded by a $2.2 million Gates Foundation grant, aims to create AI-ready datasets in 18 languages, a small fraction of the continent's more than 2,000 languages.
In just two years, the initiative has recorded 9,000 hours of speech across Kenya, Nigeria, and South Africa, capturing real-life contexts in farming, health, and education. The recorded languages include Kikuyu, Dholuo, Hausa, Yoruba, isiZulu, and Tshivenda, which are spoken by millions of people.
The hope is that the data will spur further development of AI tools tailored to African languages and useful across a range of sectors. Farmer Kelebogile Mosime highlights the impact of such initiatives, sharing her experience of using the AI-Farmer app, which supports several local languages.
As venture-backed companies like Lelapa AI emerge, there’s a clear movement towards using AI to connect meaningfully with people across Africa and to ensure equitable access to essential services. As Prof Marivate puts it, "Without indigenous languages, we not only lose data; we lose perspectives, histories, and cultures that shape our understanding of the world."