Whisper Timestamped

Whisper extension providing word-level timestamps for transcription.

Open SourceSelf HostedOffline CapableGPU Required (4GB+ VRAM)
0.0 (0)

About

whisper-timestamped extends OpenAI Whisper with per-word timestamps and per-word confidence scores using dynamic time warping over the model's cross-attention weights. It is compatible with any openai-whisper version, runs the same models, and adds heuristics to suppress common hallucinations on near-silent segments. Useful for subtitle generation, karaoke alignment, and audio editing workflows. Released under the AGPL-3.0 license.

Reviews (0)

Leave a Review

No reviews yet. Be the first to review!

Details

Price
Free
Platform
Local/Desktop
Difficulty
Easy (2/5)
License
LGPL-3.0
Minimum VRAM
4 GB
Added
Apr 3, 2026

Related Tools

Multilingual ASR model by NVIDIA supporting 4 languages with translation.

Open SourceSelf HostedOfflineGPU 8GB+
Intermediate
0.0 (0)

Convolution-augmented transformer for speech recognition in ESPnet toolkit.

Open SourceSelf HostedOfflineGPU 8GB+
Advanced
0.0 (0)

Pre-trained speech models for STT, TTS, and VAD with simple PyTorch integration.

Open SourceSelf HostedOffline
Beginner
0.0 (0)

CLI tool that transcribes audio 10x faster using pipeline optimizations.

Open SourceSelf HostedOfflineGPU 6GB+
Easy
0.0 (0)

Self-supervised speech representation model by Meta for ASR.

Open SourceSelf HostedOfflineGPU 8GB+
Advanced
0.0 (0)

Open-source speaker diarization and voice activity detection toolkit.

Open SourceSelf HostedOfflineGPU 4GB+
Intermediate
0.0 (0)
Browse all Speech-to-Text / Speech Recognition tools