ESPnet
End-to-end speech processing toolkit covering ASR, TTS, and speech translation.
Open Source, Self Hosted, Offline Capable, GPU Required (8GB+ VRAM)
About
ESPnet is an end-to-end speech processing toolkit that supports automatic speech recognition (ASR), text-to-speech (TTS), speech translation, speech enhancement, and more. It includes Conformer, Transformer, and other architectures, and is widely used in research. It is developed by Johns Hopkins, CMU, and others, and is released under the Apache 2.0 license.
Details
- Price: Free
- Platform: Local/Desktop
- Difficulty: Expert (5/5)
- License: Apache-2.0
- Minimum VRAM: 8 GB
- Added: Apr 3, 2026
Similar Tools
Featured
General-purpose speech recognition model by OpenAI trained on 680K hours of multilingual audio.
Open Source, Self Hosted, Offline
Easy
Featured
High-performance C/C++ port of Whisper for CPU-based speech recognition.
Open Source, Self Hosted, Offline
Easy
Offline speech recognition toolkit supporting 20+ languages with small models.
Open Source, Self Hosted, Offline
Easy