Why Malawi’s Chichewa AI push is part of a continent-wide race to end the language divide

Blessed Frank

When Alifosina Mtseteka, a smallholder sugarcane farmer in Malawi’s Chisemphere village, described a pest attacking her okra crop into a mobile phone, she received treatment advice in Chichewa, her own language. The AI chatbot called Ulangizi worked. She bought the chemicals, sprayed them, and the crop recovered. It was a small moment, but it pointed toward something large: the possibility that AI, long skewed toward English-speaking populations, could finally begin to serve the majority of Africans on their own linguistic terms.

That possibility is now the explicit goal of a formal government-backed initiative. Malawi has launched the Malawi Low-Resource Language Data Trust Initiative, a structured programme aimed at building AI systems that understand and respond in Chichewa, the national language spoken by roughly 70% of the country’s population and also widely used across Zambia, Mozambique and Zimbabwe. 

The initiative is developing core AI frameworks and assembling licensed Chichewa-language content to expand the training data available for AI applications, with support from the World Bank and the Gates Foundation.

The project aims to break long-standing language barriers by enabling AI-powered systems to deliver information and services in local languages, particularly benefiting rural communities where English is not the primary language of communication. Concretely, that means building datasets from government archives and media content to power voice and text chatbots, with media organisations including Nation Publications Limited contributing Chichewa content to help train the models.

Speaking at the launch workshop in Lilongwe, Feston Kaupa, Malawi’s minister of defence, delivering remarks on behalf of the ICT minister, made the case for urgency: countries that strategically adopt AI, he said, will be better positioned to drive innovation, strengthen productivity and compete in the global economy. 

The statement was striking coming from a country ranked among the world’s poorest, but it reflects a hardening consensus across the continent: AI is not a luxury but an infrastructure question, and missing it means falling further behind.

How Malawi’s Chichewa AI is solving the data problem

The challenge Malawi is trying to solve is structural. Most large language models currently available are trained on text that is more than 90% English, a sign of the deep divide between English and every other language. African languages have been particularly disadvantaged. Although about 21 million people speak Chichewa across Malawi, Zambia and Mozambique, the language remains low-resource: there is not enough digitised Chichewa data online to train machine learning models for tasks such as machine translation and automatic speech recognition.

The consequences of this gap are not abstract. When AI tools cannot understand local languages, they cannot serve local populations. Healthcare workers cannot use them to navigate clinical guidelines. Farmers cannot access pest management advice. Citizens cannot query government services. The digital economy, in effect, operates in a language that most people do not speak at home.

Training AI systems on Chichewa has proved difficult. The language’s low-resource status meant early models sometimes sounded strange to native speakers, at one point delivering responses in an Indian accent. The developers of Ulangizi persisted and eventually produced a system now used by thousands of farmers. The lesson is instructive for the government’s broader effort: technical solutions exist, but they require sustained investment and community grounding.

Malawi’s strategy identifies agricultural resilience as a priority alongside healthcare and financial inclusion, sectors where language barriers have historically blocked access most acutely. The vision is for citizens to interact with AI applications in both text and voice, a design choice that matters enormously in a country where literacy rates and smartphone penetration vary widely, but where voice communication is near-universal.

The World Bank’s Global Data Facility has highlighted the development of AI-ready language data libraries as a forward-looking priority, recognising that as AI becomes increasingly embedded in development practice, ensuring language data is inclusive, representative and accessible is essential. Malawi’s initiative is a direct expression of that priority at the national level.

Continent-wide AI inclusion efforts

Malawi is not alone. A growing number of African nations and research communities are pushing back against AI’s language exclusion, each approaching the problem differently.

Lelapa AI, a pan-African initiative, has launched InkubaLM, billed as the first multilingual large language model tailored to African languages, focusing on Swahili, Yoruba, isiXhosa, Hausa and isiZulu. Its CEO, Pelonomi Moiloa, has argued pointedly that no one should have to adopt a foreign culture to access cutting-edge tools.

Community-led efforts like Masakhane are training transformer models such as AfriBERTa and SwahBERT on a range of African languages, including Amharic, Hausa, Swahili and Yoruba. These models frequently outperform multilingual baselines on downstream tasks. In Uganda, Sunbird AI has argued for a regionally focused approach to language AI development, presenting the country, with its high linguistic diversity, as a case study and a model for building AI systems that serve communities with smaller speaker populations.

In East Africa, Jacaranda Health expanded its open-source AI model UlizaLlama in 2024 to provide AI-driven maternal health support in Swahili, Hausa, Yoruba, Xhosa and Zulu. The programme targets new and expectant mothers in peri-urban Nairobi, precisely the population most likely to be excluded by English-only tools.

Most recently, the LINGUA Africa initiative, led by Microsoft’s AI for Good Lab in collaboration with the Gates Foundation, Masakhane and Google.org, has opened a funding call for projects focused on language datasets, AI models and real-world applications that increase digital inclusion across the continent. The programme reflects the arrival of major technology companies as partners and funders in the African language AI space, a shift that brings resources but also raises questions about data ownership and who ultimately controls the infrastructure.

Those questions of ownership and control are part of what makes Malawi’s model notable. By framing its effort as a national data trust, with licensed content and government archives as inputs, the initiative asserts a degree of sovereignty over the underlying data that purely commercial approaches do not guarantee.

Africa’s AI market is projected to grow from $4.51 billion in 2025 to $16.5 billion by 2030, and the countries that build their own language infrastructure now will be better placed to capture a share of that value rather than remain dependent on external systems that do not speak their languages.
