Hangul output, spacing, and particles
Korean is written with spaces, but the spacing rules (띄어쓰기) are notoriously fiddly: particles attach to their noun, and compound verbs split or join by convention. The transcript applies standard spacing automatically, attaches particles (은/는/이/가/을/를) to the preceding noun, and inserts sentence punctuation. Output is in Hangul, with Latin script kept for embedded English; hanja are not used, matching modern written Korean.
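To spot-check the script mix in an exported transcript, a small audit like the one below may help. It is a minimal sketch, not part of the product: the function names are made up for illustration, and the Unicode ranges cover only the common Hangul and CJK ideograph blocks.

```python
from collections import Counter

def script_of(ch: str) -> str:
    """Coarsely classify one character by Unicode block (illustrative only)."""
    cp = ord(ch)
    if 0xAC00 <= cp <= 0xD7A3:  # precomposed Hangul syllables
        return "hangul"
    if 0x1100 <= cp <= 0x11FF or 0x3130 <= cp <= 0x318F:  # Hangul jamo blocks
        return "hangul"
    if 0x4E00 <= cp <= 0x9FFF or 0x3400 <= cp <= 0x4DBF or 0xF900 <= cp <= 0xFAFF:
        return "hanja"  # CJK ideographs, which is how hanja are encoded
    if ch.isascii() and ch.isalpha():
        return "latin"
    return "other"  # digits, punctuation, symbols

def script_counts(text: str) -> Counter:
    """Count non-whitespace characters per script."""
    return Counter(script_of(ch) for ch in text if not ch.isspace())

counts = script_counts("오늘 meeting은 3시입니다.")
print(counts["hangul"], counts["latin"], counts["hanja"])  # 7 7 0
```

A non-zero "hanja" count on a transcript would be worth a closer look, since the output is expected to contain none.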
Honorific register (합쇼체, 해요체) and casual speech (반말) are transcribed as spoken — verb endings come out the way the speaker said them, which matters for interviews and variety-show content.
What transcribes well, and what to watch
Seoul-standard Korean — podcasts, lectures, meetings, YouTube — sits in the model's high tier. Fast variety-show banter with overlapping speakers and sound effects is the classic hard case (as it is for human transcribers). Satoori (regional dialects): Gyeongsang and Jeolla accents on standard vocabulary transcribe fine; deep Jeju dialect is effectively out of scope. Konglish loanwords (아이스아메리카노, 미팅) are written in Hangul as Koreans write them.
Frequently asked questions
Does the transcript follow Korean spacing rules?
Yes. 띄어쓰기 is applied by the language model following standard conventions: particles attached to their nouns, dependent nouns spaced, common compounds joined. It's the same standard a careful Korean writer would use, and far more consistent than raw speech models used to be.
Are particles and verb endings accurate?
Particles are a strong point: 이/가 vs 은/는 and object marking come out correctly from context. Verb endings are transcribed as spoken, preserving register: 합니다 stays 합니다, 반말 stays 반말. Contracted casual forms (뭐 해 for 무엇을 해) are written the colloquial way.
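For readers unfamiliar with how the paired particles work, the written form within each pair depends on whether the noun's last syllable ends in a consonant (받침). The sketch below illustrates only that orthographic rule, using the standard Unicode layout of Hangul syllables. It is not how the transcription model works, the function names are hypothetical, and it does not attempt the contextual choice between topic (은/는) and subject (이/가) marking.

```python
HANGUL_BASE = 0xAC00   # first precomposed syllable, "가"
HANGUL_LAST = 0xD7A3   # last precomposed syllable, "힣"
FINALS_PER_BLOCK = 28  # 27 final consonants plus the "no final" slot

# (form after a final consonant, form after a vowel)
PARTICLE_PAIRS = {
    "topic": ("은", "는"),
    "subject": ("이", "가"),
    "object": ("을", "를"),
}

def has_final_consonant(syllable: str) -> bool:
    """Return True if a precomposed Hangul syllable ends in a consonant."""
    cp = ord(syllable)
    if not HANGUL_BASE <= cp <= HANGUL_LAST:
        raise ValueError(f"not a precomposed Hangul syllable: {syllable!r}")
    # Syllables are laid out so that index % 28 == 0 means "no final consonant".
    return (cp - HANGUL_BASE) % FINALS_PER_BLOCK != 0

def attach_particle(noun: str, role: str) -> str:
    """Attach the matching particle directly to the noun, with no space."""
    after_consonant, after_vowel = PARTICLE_PAIRS[role]
    particle = after_consonant if has_final_consonant(noun[-1]) else after_vowel
    return noun + particle

print(attach_particle("학생", "topic"))    # 학생은
print(attach_particle("학교", "subject"))  # 학교가
print(attach_particle("커피", "object"))   # 커피를
```

Nouns ending in Latin letters or digits raise an error here; handling those would need pronunciation information that this sketch does not have.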
How accurate is Korean overall?
High tier on clear speech — Korean is one of the model's better-covered Asian languages thanks to abundant media data. The gap to English shows up on fast overlapping conversation and slang-dense speech; single-speaker content (lectures, vlogs) is excellent.
What about satoori (regional dialects)?
Gyeongsang, Jeolla, and Chungcheong accents speaking near-standard Korean transcribe well — intonation differences don't hurt much. Heavy dialect vocabulary gets normalized or missed, and Jeju dialect (effectively a separate language) is out of scope.
Korean audio to English text?
This page produces the Korean transcript; for English, export TXT and machine-translate. ko→en translation on clean text is strong, and the two-step route preserves more nuance (especially register) than one-shot speech translation. In-app translation is on the roadmap.
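The two-step route can be scripted along these lines. This is a rough sketch under stated assumptions: the export is taken to be a UTF-8 text file, and translate stands in for whatever ko→en translation tool you use (it is a placeholder callable, not a real API). Lines are batched so each call stays under a size limit, which many translation services impose.

```python
from collections.abc import Callable, Iterator
from pathlib import Path

def chunk_lines(lines: list[str], max_chars: int) -> Iterator[str]:
    """Group whole lines into batches of roughly max_chars or fewer."""
    batch: list[str] = []
    size = 0
    for line in lines:
        # +1 per line for the joining newline (a slightly conservative count)
        if batch and size + len(line) + 1 > max_chars:
            yield "\n".join(batch)
            batch, size = [], 0
        batch.append(line)  # a single over-long line still goes out on its own
        size += len(line) + 1
    if batch:
        yield "\n".join(batch)

def translate_text(text: str, translate: Callable[[str], str],
                   max_chars: int = 2000) -> str:
    """Translate a transcript batch by batch; `translate` is user-supplied."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    return "\n".join(translate(chunk) for chunk in chunk_lines(lines, max_chars))

def translate_transcript(path: str, translate: Callable[[str], str]) -> str:
    """Read an exported TXT file (assumed UTF-8) and translate its contents."""
    return translate_text(Path(path).read_text(encoding="utf-8"), translate)
```

Batching on line boundaries keeps each sentence intact, which helps the translator carry register (합니다 vs 반말) through to the English.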
Will it handle K-content — variety shows, K-pop, dramas?
Dramas and interviews: yes, that's normal speech. Variety shows: expect more errors. Overlapping shouting, sound effects, and heavy reliance on on-screen captions make them hard even for humans. K-pop songs: singing is best-effort; clear ballad vocals do better than dense rap sections.