Silero Models

Pre-trained speech models for STT, TTS, and VAD with simple PyTorch integration.

Open Source, Self Hosted, Offline Capable

About

Silero Models is a set of pre-trained speech models for speech-to-text (STT), text-to-speech (TTS), and voice activity detection (VAD). Each model loads in a single line through PyTorch Hub or a pip package. The models cover several languages, are designed for production use, and are lightweight and fast on CPU. The available models are listed in a manifest file. The project is released under the MIT license.
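As a rough illustration of the single-line PyTorch Hub load, the sketch below wraps `torch.hub.load` in a small helper. The repository names (`snakers4/silero-models`, `snakers4/silero-vad`) and entry-point names (`silero_stt`, `silero_tts`, `silero_vad`) are assumptions not stated in this listing, and `HUB_SPECS`, `hub_kwargs`, and `load_silero` are hypothetical names introduced only for this example.

```python
# Sketch only: repository and entry-point names below are assumptions,
# not taken from this listing. Check the project's own docs before use.

HUB_SPECS = {
    "stt": {"repo_or_dir": "snakers4/silero-models", "model": "silero_stt"},
    "tts": {"repo_or_dir": "snakers4/silero-models", "model": "silero_tts"},
    "vad": {"repo_or_dir": "snakers4/silero-vad", "model": "silero_vad"},
}


def hub_kwargs(task, **extra):
    """Build the keyword arguments passed to torch.hub.load for one task."""
    if task not in HUB_SPECS:
        raise ValueError(
            f"unknown task {task!r}; expected one of {sorted(HUB_SPECS)}"
        )
    # Extra options (for example language="en") are merged in unchanged.
    return {**HUB_SPECS[task], **extra}


def load_silero(task, **extra):
    """Load a model via PyTorch Hub; needs torch and, on first use, network access."""
    import torch  # imported lazily so hub_kwargs works without PyTorch installed

    return torch.hub.load(**hub_kwargs(task, **extra))
```

Under these assumptions, `load_silero("stt", language="en")` would download the English speech-to-text model on first use and read it from the local hub cache afterwards, which is what allows offline operation.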

Details

Price: Free
Platform: Local/Desktop
Difficulty: Beginner (1/5)
License: MIT
Added: Apr 3, 2026

Related Tools

Whisper extension providing word-level timestamps for transcription.

Open Source, Self Hosted, Offline, GPU 4GB+
Difficulty: Easy

Multilingual ASR model by NVIDIA supporting 4 languages with translation.

Open Source, Self Hosted, Offline, GPU 8GB+
Difficulty: Intermediate

Convolution-augmented transformer for speech recognition in ESPnet toolkit.

Open Source, Self Hosted, Offline, GPU 8GB+
Difficulty: Advanced

CLI tool that transcribes audio 10x faster using pipeline optimizations.

Open Source, Self Hosted, Offline, GPU 6GB+
Difficulty: Easy

Self-supervised speech representation model by Meta for ASR.

Open Source, Self Hosted, Offline, GPU 8GB+
Difficulty: Advanced

Open-source speaker diarization and voice activity detection toolkit.

Open Source, Self Hosted, Offline, GPU 4GB+
Difficulty: Intermediate