Convert Video to Text

Upload a video and we pull the audio track and transcribe it. Export subtitles (SRT/VTT) or a plain transcript, free, with no sign-up.

No sign-up | No watermark | TXT · SRT · VTT exports | Files auto-delete in 24h

Bigger files or more uploads? Free account: 5 files/day, 1-hour files · Pro: 10-hour files + speaker labels

MP4 · MOV · MKV · WebM · AVI · WMV · MPEG · 3GP

How video transcription works here

When you upload a video, the server reads just the audio stream out of the container — your video is never re-encoded, resized, or altered, and the picture content is never analyzed. That makes the process fast even for big files: a 2 GB screen recording with 20 minutes of speech transcribes as quickly as a 20-minute MP3.

The result is a timestamped transcript you can export as SRT or VTT subtitle files ready for YouTube, Premiere, Resolve, or an HTML5 player — or as plain TXT when you just need the words.
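To make the difference between the three exports concrete, here is a minimal Python sketch that renders the same timestamped segments as SRT, VTT, and TXT. The segment tuples and the function names are illustrative assumptions, not the service's actual data model or code. The one format detail that matters most: SRT numbers its cues and uses a comma before the milliseconds, while VTT starts with a `WEBVTT` header and uses a period.

```python
from typing import List, Tuple

# Hypothetical segment shape: (start_seconds, end_seconds, text).
Segment = Tuple[float, float, str]


def format_timestamp(seconds: float, separator: str) -> str:
    """Render seconds as HH:MM:SS<separator>mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{ms:03d}"


def to_srt(segments: List[Segment]) -> str:
    """SRT: numbered cues, comma before the milliseconds."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{format_timestamp(start, ',')} --> {format_timestamp(end, ',')}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    return "\n".join(blocks)


def to_vtt(segments: List[Segment]) -> str:
    """WebVTT: 'WEBVTT' header, period before the milliseconds."""
    blocks = ["WEBVTT\n"]
    for start, end, text in segments:
        timing = f"{format_timestamp(start, '.')} --> {format_timestamp(end, '.')}"
        blocks.append(f"{timing}\n{text}\n")
    return "\n".join(blocks)


def to_txt(segments: List[Segment]) -> str:
    """Plain transcript: the words only, no timing."""
    return " ".join(text for _, _, text in segments)
```

All three functions read the same list, which is why one transcription result can feed every export format.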

Meetings, screen recordings, lectures, footage

Typical uploads here are Zoom and Teams recordings, OBS screen captures, lecture recordings, and phone footage. These typically carry compressed AAC audio, which the model handles well. If a video has several audio tracks (some screen recorders save mic and system audio separately), only the first track is transcribed, so export from your recorder with the tracks merged if you need both sides of a call.
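If the recording already exists with separate tracks, one possible workaround is to mix them yourself before uploading. The sketch below builds an ffmpeg command that combines the audio tracks of a file into a single audio-only output using ffmpeg's `amix` filter. It assumes ffmpeg is installed locally; the function name and file names are hypothetical, and this is not something the service does for you.

```python
from typing import List


def build_merge_command(src: str, dst: str, track_count: int = 2) -> List[str]:
    """Build an ffmpeg argv that mixes the first `track_count` audio tracks."""
    if track_count < 2:
        raise ValueError("need at least two audio tracks to merge")
    # Reference each audio track of input 0: [0:a:0][0:a:1]...
    inputs = "".join(f"[0:a:{i}]" for i in range(track_count))
    # amix sums the tracks; duration=longest keeps the longer track intact.
    filter_graph = f"{inputs}amix=inputs={track_count}:duration=longest[mixed]"
    return [
        "ffmpeg", "-i", src,
        "-filter_complex", filter_graph,
        "-map", "[mixed]",  # write only the mixed audio, no video
        dst,
    ]
```

Passing the returned list to `subprocess.run(..., check=True)` would execute it. Mixing re-encodes the audio, so the output is a new file rather than a copy of either track.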

Frequently asked questions

How is the audio extracted from my video?
Server-side, with a standard demuxer: we read the existing audio stream out of the container without re-encoding the video. Your file isn't converted, compressed, or watermarked — the video track is simply ignored, and only the speech is analyzed.
Which video containers are supported?
MP4, MOV, MKV, WebM, AVI, WMV, MPEG/MPG, and 3GP. That covers phone videos (MP4/MOV), screen recorders (MKV/MP4/WebM), meeting-app downloads (MP4), and most camera footage. If you have something more exotic, remuxing it to MP4 with any free tool usually makes it uploadable.
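As one example of such a free tool, ffmpeg can remux a file with a stream copy. This is a sketch that assumes ffmpeg is installed; the function name and the `.flv` input are hypothetical. A stream copy only succeeds when the existing codecs are ones MP4 can carry, so if ffmpeg reports an error, dropping `-c copy` lets it re-encode instead.

```python
from pathlib import Path
from typing import List


def build_remux_command(src: str) -> List[str]:
    """Build an ffmpeg argv that rewraps `src` into an MP4 container."""
    source = Path(src)
    if source.suffix.lower() == ".mp4":
        raise ValueError("source is already an MP4 file")
    dst = str(source.with_suffix(".mp4"))
    # -c copy moves the streams into the new container without re-encoding.
    return ["ffmpeg", "-i", src, "-c", "copy", dst]
```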
Do you keep my video?
Anonymous uploads, both the video file and the transcript, are deleted automatically 24 hours after upload. Account uploads stay in your library until you delete them. Either way, the video is only ever stored to be transcribed; it isn't used for training or shared anywhere.
Should I export a transcript or subtitles?
Both come from the same result. Choose TXT when you want readable prose — meeting notes, quotes, an article draft. Choose SRT or VTT when the text goes back under the video: SRT for YouTube uploads and most editors, VTT for web players. You can download all of them from the same editor.
My video file is huge — any tips?
Duration is the real limit, not gigabytes, but upload time is on you. If your connection is slow, extract the audio first (most editors export audio-only in seconds) and upload that instead — a 60 MB M4A uploads far faster than the 4 GB video it came from, and the transcript is identical.
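If your editor has no quick audio-only export, ffmpeg is one way to pull the audio out locally. The sketch below assumes ffmpeg is installed; the function and file names are hypothetical. It copies the first audio track as-is, which works for an `.m4a` output when that track is AAC. Other codecs need a matching output container.

```python
import shutil
import subprocess
from typing import List


def build_extract_command(src: str, dst: str) -> List[str]:
    """Build an ffmpeg argv that copies the first audio track out of `src`."""
    return [
        "ffmpeg", "-i", src,
        "-vn",              # drop the video stream
        "-map", "0:a:0",    # select the first audio track
        "-c:a", "copy",     # copy the audio bytes, no re-encoding
        dst,
    ]


def extract_audio(src: str, dst: str) -> None:
    """Run the extraction; raises if ffmpeg is missing or exits with an error."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    subprocess.run(build_extract_command(src, dst), check=True)
```

For example, `extract_audio("talk.mp4", "talk.m4a")` would write the smaller file to upload. Because the audio is copied rather than re-encoded, the speech content is the same as in the original video.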
What about videos with loud music beds?
Speech over a moderate music bed usually transcribes fine — the model is good at focusing on the voice. Where it struggles is music louder than the speech, heavy ducking, or sung lyrics mixed with dialogue. Expect to fix more in the editor on those sections.