Vid2Txt
A Python package for transcribing video and audio files to text using speech-to-text services. AssemblyAI is currently the only supported service.
Features
- Download and transcribe from YouTube or any URL (via yt-dlp)
- Extract audio from video files using FFmpeg
- Direct support for audio formats (MP3, WAV, M4A, AAC, FLAC, OGG, WMA)
- Transcribe audio using the AssemblyAI API
- Export transcripts in multiple formats:
  - Plain text (.txt)
  - SubRip subtitles (.srt)
  - Interactive HTML (.html) with embedded video/audio player
- Option to force the transcription language
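For readers unfamiliar with the SubRip format mentioned above: each cue is a running number, a time range written as `HH:MM:SS,mmm --> HH:MM:SS,mmm`, and the cue text. The sketch below shows that layout in isolation. The helper names `srt_timestamp` and `srt_cue` are hypothetical and are not part of vid2txt, whose own SRT writer may differ.

```python
def srt_timestamp(seconds: float) -> str:
    """Format an offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def srt_cue(index: int, start: float, end: float, text: str) -> str:
    """Render one numbered cue; hypothetical helper, not vid2txt's API."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"


if __name__ == "__main__":
    print(srt_cue(1, 0, 2.5, "Hello"))
```

Cues in a complete .srt file are separated by a blank line.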
Installation
pip install vid2txt
Setup
Set your AssemblyAI API key as an environment variable:
# On Linux/macOS (bash, zsh):
export ASSEMBLYAI_API_KEY="your-api-key-here"
# On Windows (PowerShell):
$env:ASSEMBLYAI_API_KEY="your-api-key-here"
Get a free API key from: https://www.assemblyai.com/dashboard/signup
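When calling the package from your own script, it can help to check that the variable is set before starting a long download or transcription. The following is a minimal sketch of such a check; `load_api_key` is a hypothetical helper written for this example, not a function provided by vid2txt.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "ASSEMBLYAI_API_KEY"  # variable name taken from the Setup section


def load_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key from the environment, or fail with a clear message.

    Hypothetical helper; `env` defaults to os.environ and exists only so the
    function can be exercised with a plain dict.
    """
    env = os.environ if env is None else env
    key = env.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"{ENV_VAR} is not set; export it before transcribing.")
    return key
```

The returned value can then be passed as the `api_key` argument shown in the Python API section.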
Usage
Command Line Interface
vid2txt MEDIA_PATH [OPTIONS]
Where MEDIA_PATH can be:
- A local video file (.mp4, .mkv, .mov, ...)
- A local audio file (.mp3, .wav, ...)
- A YouTube or other URL
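The three input kinds above can be told apart from the argument alone. The sketch below illustrates one way to do so, assuming that URLs use the http or https scheme and that anything without a known audio extension is treated as video. `classify_media` is a hypothetical helper; vid2txt's actual detection logic is not documented here and may differ.

```python
from pathlib import Path
from urllib.parse import urlparse

# Audio formats listed under Features
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".aac", ".flac", ".ogg", ".wma"}


def classify_media(media: str) -> str:
    """Return "url", "audio", or "video" for a MEDIA_PATH argument."""
    parsed = urlparse(media)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    if Path(media).suffix.lower() in AUDIO_EXTENSIONS:
        return "audio"
    # Assumption: everything else is a video that needs audio extraction
    return "video"
```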
Options
- -o, --output-dir: Output directory (default: same as input file)
- -l, --language: Force transcription language (e.g., 'en', 'ar', 'es')
- --model: Speech-to-text model to use (currently only 'assemblyai')
- --force-audio-extract: Force re-extraction of audio from video files
- --audio: Download audio only when using YouTube URLs (faster)
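To make the shape of these options concrete, here is a sketch of how an equivalent interface could be declared with `argparse`. This is an illustration only and not the package's actual source: the defaults (`None` for the output directory and language, `"assemblyai"` for the model) and the function name `build_parser` are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="vid2txt")
    parser.add_argument("media_path", metavar="MEDIA_PATH",
                        help="local video/audio file, or a URL")
    parser.add_argument("-o", "--output-dir", default=None,
                        help="output directory (default: same as input file)")
    parser.add_argument("-l", "--language", default=None,
                        help="force transcription language, e.g. 'en'")
    parser.add_argument("--model", default="assemblyai", choices=["assemblyai"],
                        help="speech-to-text model to use")
    parser.add_argument("--force-audio-extract", action="store_true",
                        help="re-extract audio from video files")
    parser.add_argument("--audio", action="store_true",
                        help="download audio only for URLs")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args(["video.mp4", "-l", "en"]))
```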
Examples
# Transcribe a local video file
vid2txt video.mp4
# Transcribe a local audio file
vid2txt podcast.mp3
# Transcribe from YouTube (downloads best video+audio)
vid2txt "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
# Download and transcribe audio only (faster)
vid2txt "https://www.youtube.com/watch?v=dQw4w9WgXcQ" --audio
# Specify output directory and language
vid2txt "https://www.youtube.com/watch?v=dQw4w9WgXcQ" -o ./output -l en
# Force re-extraction of audio even if cached
vid2txt video.mp4 --force-audio-extract
# Show help
vid2txt -h
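To run the CLI over several inputs from a script, the commands above can be assembled as argument lists and passed to `subprocess.run`. The sketch below uses only the flags documented under Options; `build_command` and `transcribe_all` are hypothetical helper names, and the example assumes the `vid2txt` executable is on your PATH.

```python
import subprocess
from typing import Iterable, List, Optional


def build_command(media: str, output_dir: Optional[str] = None,
                  language: Optional[str] = None,
                  audio_only: bool = False) -> List[str]:
    """Build the argument list for one vid2txt invocation."""
    cmd = ["vid2txt", media]
    if output_dir:
        cmd += ["-o", output_dir]
    if language:
        cmd += ["-l", language]
    if audio_only:
        cmd.append("--audio")
    return cmd


def transcribe_all(items: Iterable[str], **options) -> None:
    """Run vid2txt once per item; raises CalledProcessError on failure."""
    for media in items:
        subprocess.run(build_command(media, **options), check=True)
```

Because the arguments are passed as a list rather than a shell string, URLs containing characters such as `?` and `&` need no extra quoting.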
Python API
from pathlib import Path

from vid2txt import Transcriber

media_path = Path("video.mp4")  # or Path("audio.mp3"), or a URL
output_dir = Path("output")     # Output directory
api_key = "your_api_key"        # if using assemblyai

transcriber = Transcriber(
    output_dir=output_dir,
    language="en",
    model="assemblyai",
    api_key=api_key,
)
segments = transcriber.transcribe(media_path=media_path)

# Save in different formats
transcriber.save_plain_text(
    segments=segments,
    out_path=output_dir / "transcript.txt",
)
transcriber.save_srt(
    segments=segments,
    out_path=output_dir / "transcript.srt",
)
transcriber.save_html(
    segments=segments,
    out_path=output_dir / "transcript.html",
    media_path=media_path,
)
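The example above writes every transcript under a fixed name. When processing several files into one directory, you may prefer output names derived from each input. The sketch below shows one way to compute them; `transcript_paths` is a hypothetical helper and not part of vid2txt.

```python
from pathlib import Path
from typing import Dict


def transcript_paths(media_path: Path, output_dir: Path) -> Dict[str, Path]:
    """Map each export format to an output path named after the media file."""
    stem = Path(media_path).stem  # "talks/video.mp4" -> "video"
    return {ext: Path(output_dir) / f"{stem}.{ext}" for ext in ("txt", "srt", "html")}


if __name__ == "__main__":
    for fmt, path in transcript_paths(Path("video.mp4"), Path("output")).items():
        print(fmt, path)
```

Each resulting path can be passed as `out_path` to the matching `save_plain_text`, `save_srt`, or `save_html` call.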