DeepSpeech

End-to-end speech recognition engine by Mozilla using TensorFlow.

Open Source, Self Hosted, Offline Capable

About

Mozilla's DeepSpeech is an open-source speech-to-text engine based on the architecture described in Baidu's Deep Speech research paper and implemented in TensorFlow. It converts audio to text entirely on-device with no cloud dependency, supports real-time streaming transcription, and runs on hardware ranging from a Raspberry Pi 4 to GPU servers, which made it a popular choice for privacy-preserving and embedded voice applications. Pretrained English models were distributed alongside tooling for training custom models on new languages and acoustic conditions. Development has ended: Mozilla archived the repository in June 2025, making it read-only, so no further updates or fixes should be expected, though the code remains usable under the MPL 2.0 license. It is still referenced by developers studying end-to-end speech recognition or maintaining legacy offline transcription systems.
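For readers maintaining a legacy system, the streaming transcription described above can be sketched with the `deepspeech` Python package (0.9.x API). This is a minimal sketch, not an official example: the helper names `read_wav_int16`, `transcribe_streaming`, and `main` are hypothetical, the model, scorer, and WAV paths are placeholders supplied by the caller, and the input is assumed to be 16-bit mono PCM at the model's sample rate.

```python
import wave

import numpy as np


def read_wav_int16(path):
    """Load a 16-bit mono WAV file as a NumPy int16 array plus its sample rate."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("expected 16-bit mono PCM audio")
        rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())
    return np.frombuffer(frames, dtype=np.int16), rate


def transcribe_streaming(model, audio, chunk_samples=4000):
    """Feed audio to a DeepSpeech stream chunk by chunk and return the final text.

    `model` is expected to behave like deepspeech.Model (it must provide
    createStream()); chunk_samples is an arbitrary illustrative size.
    """
    stream = model.createStream()
    for start in range(0, len(audio), chunk_samples):
        stream.feedAudioContent(audio[start:start + chunk_samples])
    return stream.finishStream()


def main(model_path, scorer_path, wav_path):
    """Hypothetical entry point; needs `pip install deepspeech` and model files."""
    from deepspeech import Model  # imported lazily so the helpers work without it

    model = Model(model_path)
    model.enableExternalScorer(scorer_path)  # optional language-model scorer
    audio, rate = read_wav_int16(wav_path)
    if rate != model.sampleRate():
        raise ValueError(f"audio is {rate} Hz, model expects {model.sampleRate()} Hz")
    return transcribe_streaming(model, audio)
```

In a live setting the chunks would come from a microphone buffer rather than a file, and `stream.intermediateDecode()` could be called between chunks to show partial results. Because the repository is archived, the package may not install on recent Python versions, so a pinned legacy environment is a reasonable assumption here.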

Details

Category: Speech-to-Text / Speech Recognition
Price: Free
Platform: Local/Desktop
Difficulty: Intermediate (3/5)
License: MPL-2.0
Added: Apr 3, 2026

Tags

stt, asr, mozilla, tensorflow, streaming

Related Tools

Conformer (ESPnet)

Speech-to-Text / Speech Recognition

Convolution-augmented transformer for speech recognition in ESPnet toolkit.

Open Source, Self Hosted, Offline, GPU 8GB+

Advanced

ESPnet

Speech-to-Text / Speech Recognition

End-to-end speech processing toolkit covering ASR, TTS, and speech translation.

Open Source, Self Hosted, Offline, GPU 8GB+

Expert

Insanely Fast Whisper

Speech-to-Text / Speech Recognition

CLI tool that transcribes audio 10x faster using pipeline optimizations.

Open Source, Self Hosted, Offline, GPU 6GB+

Easy

Kaldi

Speech-to-Text / Speech Recognition

Established speech recognition toolkit used in research and production systems.

Open Source, Self Hosted, Offline

Expert

Wav2Vec 2.0

Speech-to-Text / Speech Recognition

Self-supervised speech representation model by Meta for ASR.

Open Source, Self Hosted, Offline, GPU 8GB+

Advanced

Canary (NVIDIA NeMo)

Speech-to-Text / Speech Recognition

Multilingual ASR model by NVIDIA supporting 4 languages with translation.

Open Source, Self Hosted, Offline, GPU 8GB+

Intermediate
