MSA (فصحى) · Egyptian · Levantine · Gulf · RTL output
MSA is the strong case; dialects vary
Modern Standard Arabic (فصحى), the register of news, speeches, lectures, and formal interviews, transcribes best. That's where the training data is richest, and on clear audio you can expect accuracy close to that of major European languages.
Spoken Arabic is diglossic, and the dialects are where honesty matters: Egyptian and Levantine (the media-heavy dialects) transcribe reasonably well, often lightly normalized toward MSA spellings. Gulf dialects are serviceable. Maghrebi (Moroccan/Algerian darija) is the hard case: heavy mixing with French and Amazigh (Berber), plus phonology far from MSA, means noticeably rougher output. For darija recordings, budget real editing time.
Right-to-left text, hamza, and taa marbuta
Output is proper right-to-left Arabic script with standard orthography — hamza seats, taa marbuta, and alif variants written conventionally. Short vowels (harakat) are not written, as in normal Arabic text. The editor and all exports handle RTL correctly, and SRT/VTT files carry the Arabic text ready for players that support RTL subtitles (most modern ones do).
Frequently asked questions
Which Arabic does this handle best?
Modern Standard Arabic, clearly: newscasts, speeches, and other formal registers get the most accurate transcripts. Among dialects, Egyptian and Levantine come out best (they dominate Arabic media), Gulf next, and Maghrebi darija weakest. Mixed MSA/dialect speech, common in interviews, comes out mostly right, with dialect words sometimes normalized toward MSA spellings.
Will Moroccan/Algerian darija work?
Set expectations low: darija's French and Amazigh mixing, plus phonology far from MSA, make it the hardest mainstream Arabic variant. You'll get a usable skeleton for clear speech but should plan on editing. For formal Maghrebi speech (news, education), accuracy is much better.
Does the transcript include harakat (short vowels)?
No — like virtually all written Arabic, the output is unvocalized: consonants and long vowels only, with hamza and taa marbuta written correctly. That's the standard for readable Arabic text; fully vocalized output isn't something speech models produce reliably.
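If you want to check a transcript against a vocalized reference (say, a script written with full tashkil), the reference needs its marks removed first so the two texts are comparable. Here is a minimal Python sketch, assuming the marks of interest are the Unicode range U+064B to U+0652 plus the superscript alef U+0670. The name strip_harakat is hypothetical and not part of this product.

```python
import re

# Arabic short-vowel and related marks: fathatan through sukun
# (U+064B-U+0652) plus superscript alef (U+0670).
# Assumption: this covers the marks you want removed; extend the
# character class if your reference text uses others.
HARAKAT = re.compile("[\u064B-\u0652\u0670]")

def strip_harakat(text: str) -> str:
    """Return text with vocalization marks removed; letters are untouched."""
    return HARAKAT.sub("", text)

vocalized = "كَتَبَ"
print(strip_harakat(vocalized))  # كتب
```

Hamza forms and taa marbuta are ordinary letters rather than combining marks, so they pass through unchanged.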
Is right-to-left handled properly in exports?
Yes. The editor renders RTL naturally, TXT exports are plain UTF-8 Arabic text, and SRT/VTT subtitle files work in players with RTL support (YouTube, VLC, modern web players). Mixed-direction lines — Arabic with an English brand name — follow standard bidi rendering.
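To see why a subtitle line that opens with an English brand name can display differently from one that opens in Arabic, here is a rough Python sketch of the first-strong-character heuristic from the Unicode Bidirectional Algorithm (rules P2 and P3). It is simplified (it ignores isolate handling), uses only the standard library's unicodedata.bidirectional, and the name base_direction is hypothetical. Players may also apply their own direction settings, so treat it as a diagnostic rather than a description of how any particular player behaves.

```python
import unicodedata

def base_direction(line: str) -> str:
    """Guess paragraph direction from the first strong bidi character."""
    for ch in line:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):  # right-to-left letters, incl. Arabic
            return "rtl"
        if bidi_class == "L":          # Latin and other left-to-right letters
            return "ltr"
    # No strong character found (digits, punctuation only): default to LTR.
    return "ltr"

print(base_direction("مرحبا بكم في YouTube"))  # rtl
print(base_direction("YouTube قناة"))          # ltr
```

If a line comes out as "ltr" but should read Arabic-first, setting the direction explicitly in the player, or prefixing the line with a right-to-left mark (U+200F), is one way to handle it.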
Can I get English text from Arabic audio?
This page transcribes Arabic audio into Arabic text, which is the step it does accurately. For English, export the TXT and machine-translate it; Arabic-to-English translation of clean text is generally more reliable than one-shot speech translation, especially for MSA. Transcript translation inside the app is on the roadmap.
Does it work for Quran recitation or religious audio?
Tajwid recitation is melodic and highly stylized. The model often recognizes the text (much of it is in the training data), but this isn't a reliable Quran transcription tool, and the output won't carry correct vocalization. Khutbas, lectures, and du'a in a spoken register transcribe like ordinary MSA or dialect speech.