Transcribe Arabic Audio to Text

Arabic (العربية) to text: Modern Standard Arabic at high accuracy, major dialects with honest caveats. Free, no account needed.

No sign-up · No watermark · TXT, SRT, and VTT exports · Files auto-delete in 24h


Language pre-set for this page: Arabic (العربية)

Bigger files or more uploads? Free account: 5 files/day, 1-hour files · Pro: 10-hour files + speaker labels

MSA (فصحى) · Egyptian · Levantine · Gulf · RTL output

MSA is the strong case; dialects vary

Modern Standard Arabic (فصحى), the register of news, speeches, lectures, and formal interviews, is where transcription is most accurate. That's where the training data is richest, and on clear audio you can expect accuracy close to that of major European languages.

Spoken Arabic is diglossic, and the dialects are where honesty matters. Egyptian and Levantine, the media-heavy dialects, transcribe reasonably well, though dialect words are often lightly normalized toward MSA spellings. Gulf dialects are serviceable. Maghrebi (Moroccan/Algerian darija) is the hard case: heavy French and Amazigh (Berber) mixing and phonology far from MSA mean noticeably rougher output. For darija recordings, budget real editing time.

Right-to-left text, hamza, and taa marbuta

Output is proper right-to-left Arabic script with standard orthography — hamza seats, taa marbuta, and alif variants written conventionally. Short vowels (harakat) are not written, as in normal Arabic text. The editor and all exports handle RTL correctly, and SRT/VTT files carry the Arabic text ready for players that support RTL subtitles (most modern ones do).

Frequently asked questions

Which Arabic does this handle best?

Modern Standard Arabic, clearly: newscasts, speeches, and other formal registers get the most accurate transcripts. Among dialects, Egyptian and Levantine come out best (they have the most media exposure), Gulf next, and Maghrebi darija weakest. Mixed MSA/dialect speech, common in interviews, comes out mostly right, with dialect words sometimes normalized toward MSA.

Will Moroccan/Algerian darija work?

Set expectations low: darija's French and Amazigh mixing, plus phonology far from MSA, make it the hardest mainstream Arabic variant. You'll get a usable skeleton for clear speech but should plan on editing. For formal Maghrebi speech (news, education), accuracy is much better.

Does the transcript include harakat (short vowels)?

No — like virtually all written Arabic, the output is unvocalized: consonants and long vowels only, with hamza and taa marbuta written correctly. That's the standard for readable Arabic text; fully vocalized output isn't something speech models produce reliably.
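To make "unvocalized" concrete: harakat are Unicode combining marks, so they can be detected or stripped, for example to normalize a vocalized reference text before comparing it with a transcript. The following is a minimal Python sketch, assuming the common mark range U+064B to U+0652 plus the superscript alif U+0670; the function names are illustrative and not part of this tool.

```python
import re

# Assumption: the marks of interest are tanwin, fatha, damma, kasra,
# shadda, and sukun (U+064B-U+0652) plus superscript alif (U+0670).
# Rarer Quranic annotation marks are deliberately not covered here.
HARAKAT = re.compile("[\u064B-\u0652\u0670]")

def has_harakat(text: str) -> bool:
    """Return True if the text contains any of the marks above."""
    return HARAKAT.search(text) is not None

def strip_harakat(text: str) -> str:
    """Remove the marks above, leaving consonants and long vowels."""
    return HARAKAT.sub("", text)

# "kataba" written with a fatha on each of its three letters.
vocalized = "\u0643\u064E\u062A\u064E\u0628\u064E"
plain = strip_harakat(vocalized)  # three bare letters: kaf, ta, ba
```

Transcripts from this page should already look like `plain`: running `has_harakat` on them would be expected to return `False`.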

Is right-to-left handled properly in exports?

Yes. The editor renders RTL naturally, TXT exports are plain UTF-8 Arabic text, and SRT/VTT subtitle files work in players with RTL support (YouTube, VLC, modern web players). Mixed-direction lines — Arabic with an English brand name — follow standard bidi rendering.
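For readers who post-process exports, here is a hedged Python sketch of what a UTF-8 SRT cue with mixed-direction text looks like. It assumes the file stores characters in logical (typing) order and leaves visual reordering to the player's bidi rendering; the helper names and the sample sentence are illustrative, not this tool's actual output.

```python
import unicodedata

def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def srt_cue(index: int, start: float, end: float, text: str) -> str:
    """Build one SRT cue: index line, time range line, text line."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"

# Arabic sentence containing an English brand name, kept in logical order.
cue = srt_cue(1, 0.0, 2.5, "مرحبا بكم في YouTube")

# Subtitle files should be written as UTF-8; encoding here avoids touching disk.
data = cue.encode("utf-8")

# Bidi classes explain the rendering: Arabic letters are "AL", Latin letters "L".
classes = [unicodedata.bidirectional(ch) for ch in "مY"]
```

When saving to disk, opening the file with `encoding="utf-8"` keeps the Arabic intact; no manual reversal of the text is needed for players that implement bidi.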

Can I get English text from Arabic audio?

This page transcribes Arabic into Arabic, which is the accurate step. For English, export the TXT and machine-translate it; ar→en translation of clean text is generally more accurate than one-shot speech translation, especially for MSA. Transcript translation inside the app is on the roadmap.

Does it work for Quran recitation or religious audio?

Tajwid recitation is sung/melodic and highly stylized — the model often recognizes the text (much of it is in training data) but this isn't a reliable Quran transcription tool, and output won't carry correct vocalization. For khutbas, lectures, and du'a in spoken register, it performs like normal MSA/dialect speech.