Silero Models

Pre-trained speech models for STT, TTS, and VAD with simple PyTorch integration.

Open Source, Self Hosted, Offline Capable

About

Silero Models bundles production-oriented pretrained speech models for speech-to-text (STT), text-to-speech (TTS), and voice activity detection (VAD). Each model loads in a single line through PyTorch Hub or a pip package. The models are deliberately lightweight and run well on CPU, which makes them practical for servers and edge deployments without GPUs. Language coverage centers on Russian, English, Ukrainian, and other languages of the CIS region, including many minority languages. The newer TTS generations extend to more than 20 languages with multiple speakers per language and SSML support; for Russian they also add automatic stress placement and homograph resolution. The TTS models are fully end-to-end, so no external vocoder or aligner is needed, and an accompanying manifest file lists all available checkpoints. Licensing varies by model: several base models are MIT licensed, while many others carry a non-commercial Creative Commons license, so check the terms for each checkpoint. Typical users are developers who want dependable offline speech components without training anything themselves.
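As a rough illustration of the single-line PyTorch Hub loading described above, the sketch below wraps an STT load and a transcription call. It assumes the repository id `snakers4/silero-models`, the entry point name `silero_stt`, and the helper tuple returned alongside the model, all recalled from the project's README rather than stated on this page; check them against the current repository. The function names `load_stt` and `transcribe` are hypothetical wrappers invented for this example.

```python
# Assumed PyTorch Hub repo id (not given on this page); verify before use.
REPO = "snakers4/silero-models"


def load_stt(language="en", device="cpu"):
    """Load an STT model through PyTorch Hub.

    Weights are downloaded on the first call and cached locally afterwards.
    Returns (model, decoder, utils) as provided by the hub entry point.
    """
    import torch  # imported lazily so this module imports without PyTorch

    return torch.hub.load(
        repo_or_dir=REPO,
        model="silero_stt",        # assumed entry point name
        language=language,
        device=torch.device(device),
    )


def transcribe(wav_paths, language="en", device="cpu"):
    """Transcribe a list of audio file paths; returns one string per file."""
    import torch

    model, decoder, utils = load_stt(language, device)
    # Assumed layout of the helper tuple returned by the hub entry point.
    read_batch, split_into_batches, read_audio, prepare_model_input = utils

    batch = read_batch(list(wav_paths))
    model_input = prepare_model_input(batch, device=torch.device(device))
    output = model(model_input)
    return [decoder(example.cpu()) for example in output]
```

Calling `transcribe(["speech.wav"])` (the file name is a placeholder) would need network access once to fetch the checkpoint; later calls can run offline from the local cache. Because licensing differs by checkpoint, confirm the terms of whichever model this loads before using it commercially.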

Details

Category: Speech-to-Text / Speech Recognition
Price: Free
Platform: Local/Desktop
Difficulty: Beginner (1/5)
License: MIT for several base models; non-commercial Creative Commons for many others
Added: Apr 3, 2026

Website GitHub

Related Tools

Conformer (ESPnet)

ESPnet

Insanely Fast Whisper

Kaldi

Wav2Vec 2.0

Canary (NVIDIA NeMo)