Whisper Audio (Transcription+)

Audio processing toolkit that builds on Whisper for timestamp refinement and subtitling.

Open Source, Self Hosted, Offline Capable, GPU Required (4GB+ VRAM)

About

Stable-ts extends OpenAI Whisper to produce more reliable word-level and segment-level timestamps, and adds tools for transcription workflows. It refines Whisper's native timing, supports regrouping and editing of segments, suppresses hallucinated text in silent regions, and can write subtitle files. It works with any Whisper model size on CPU or GPU and is useful for subtitling, captioning, and audio editing. It is distributed as an open-source Python package.
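A minimal sketch of the workflow described above, assuming the stable-ts package is installed (pip install stable-ts) and imported as stable_whisper. The helper names srt_timestamp, segments_to_srt, and transcribe_to_srt are hypothetical and exist only for this illustration. The two pure-Python helpers show what timestamped segments look like once rendered as SubRip (SRT) cues; they are a hand-rolled illustration, not the library's own writer.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Render (start, end, text) tuples as numbered SRT cues.

    Illustration only: the tuple layout is an assumption made for this
    sketch, not the result structure that stable-ts returns.
    """
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    return "\n".join(cues)


def transcribe_to_srt(audio_path: str, srt_path: str, model_name: str = "base") -> str:
    """Transcribe an audio file and write an SRT file using stable-ts.

    Hypothetical wrapper. The import is deferred so the helpers above
    can be used without stable-ts installed.
    """
    import stable_whisper  # provided by the stable-ts package

    model = stable_whisper.load_model(model_name)  # any Whisper model size
    result = model.transcribe(audio_path)          # refined timestamps
    result.to_srt_vtt(srt_path)                    # subtitle output
    return srt_path
```

Calling transcribe_to_srt("audio.mp3", "audio.srt") would download the chosen model on first use; the file names here are placeholders.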


Details

Price: Free
Platform: Local/Desktop
Difficulty: Easy (2/5)
License: MIT
Minimum VRAM: 4 GB
Added: Apr 3, 2026

Related Tools

Fast music generation model producing full songs with lyrics in seconds.

Open Source, Self Hosted, Offline, GPU 8GB+
Difficulty: Intermediate

Open-source toolkit for audio, music, and speech generation research.

Open Source, Self Hosted, Offline, GPU 8GB+
Difficulty: Advanced

Original latent diffusion model for text-to-audio generation.

Open Source, Self Hosted, Offline, GPU 8GB+
Difficulty: Intermediate
Featured

State-of-the-art music source separation model by Meta for splitting tracks.

Open Source, Self Hosted, Offline
Difficulty: Easy

High-fidelity neural audio codec by Meta for audio compression and tokenization.

Open Source, Self Hosted, Offline
Difficulty: Intermediate

Updated music generation model with improved quality and longer generation.

Open Source, Self Hosted, Offline, GPU 8GB+
Difficulty: Intermediate