
Nigerian Pidgin (Naija) Speech Recognition

Orinode's Pidgin ASR is one of the strongest results in Maraba v1: 9.8% WER on 200 spontaneous samples (May 2026). Nigerian Pidgin — locally Naija — is the most widely spoken Nigerian language with an estimated 75+ million speakers, yet it has no dedicated language slot in any global ASR model. We made one.

Looking for the deployable product? This Pidgin ASR powers Maraba, an AI call agent that answers Naija callers in their actual register and handles Pidgin↔English code-switching mid-sentence.

What Nigerian Pidgin is

Nigerian Pidgin (ISO 639-3 pcm) is a creole that emerged from English contact with West African languages, predominantly Yorùbá, Igbo, Hausa, Edo, and Efik. It is grammatically and phonologically distinct from any variety of English.

Why standard ASR fails on Pidgin

Whisper, Google Speech-to-Text, and Azure all classify Pidgin audio as "broken English" and transcribe it phonetically against an English lexicon, producing output like "we are going to the market" for "we dey go market". Worse, they often return no transcript at all when language detection mislabels the segment.

Orinode's approach

Performance (May 2026)

Metric                      Value   N
WER (normalized)            9.76%   200
WER (raw, case-sensitive)   9.90%   200
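
The two WER rows differ only in whether the transcripts are normalized before scoring. The sketch below shows how such figures are typically computed: word-level edit distance summed over all samples, divided by the total number of reference words. It is an illustration under stated assumptions, not Orinode's evaluation code. The function names (`normalize`, `word_errors`, `corpus_wer`) are hypothetical, and the normalization shown (lowercasing and stripping punctuation) is an assumed one that may differ from what was actually used.

```python
import re


def normalize(text: str) -> str:
    """Assumed normalization: lowercase, drop punctuation, collapse spaces."""
    text = text.lower()
    text = re.sub(r"[^\w\s']", " ", text)  # keep letters, digits, apostrophes
    return " ".join(text.split())


def word_errors(reference: str, hypothesis: str) -> tuple[int, int]:
    """Return (word-level edit distance, number of reference words)."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))  # distance from an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (or match, cost 0)
            ))
        prev = curr
    return prev[-1], len(ref)


def corpus_wer(pairs, normalized: bool = True) -> float:
    """Corpus-level WER over (reference, hypothesis) pairs."""
    errors = words = 0
    for ref, hyp in pairs:
        if normalized:
            ref, hyp = normalize(ref), normalize(hyp)
        e, n = word_errors(ref, hyp)
        errors += e
        words += n
    if words == 0:
        raise ValueError("no reference words to score against")
    return errors / words
```

Applied to the example from earlier on this page, `corpus_wer([("we dey go market", "we are going to the market")])` counts two substitutions and two insertions against four reference words, a WER of 1.0. Scoring raw text penalizes case and punctuation mismatches as well, which is why the raw figure is the higher of the two.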

Use cases

Pidgin ASR matters most for: customer-service call centers (BVN verification, banking complaints, telco issues), broadcast captioning for Naija-language radio/TV stations, and government services trying to be linguistically inclusive for the majority of Nigerians who use Pidgin as their working everyday language.

Get the model

Maraba v1 weights with the Pidgin (Hawaiian-slot) configuration: huggingface.co/Orinode. Production API: [email protected].

For the deployable voice agent built on this model, see Maraba at maraba.ai.