Text-to-Speech (TTS) AI Tools
Open-source text-to-speech engines and models for generating natural-sounding speech from text input.
Few-shot voice cloning and TTS combining GPT and SoVITS architectures.
Lightweight and expressive TTS model with 82M parameters for fast local inference.
Transformer-based text-to-audio model by Suno that generates speech, music, and sound effects.
Multilingual TTS with zero-shot voice cloning and streaming support.
Large-scale multilingual TTS model by Alibaba with zero-shot voice cloning.
Conversational TTS model optimized for dialogue and chat applications.
High-quality multilingual TTS library by MyShell with fast CPU inference.
Privacy-focused neural TTS engine by Mycroft AI for offline voice assistants.
Compact open-source speech synthesizer supporting 100+ languages.
University of Edinburgh speech synthesis system with decades of research behind it.
Multi-voice TTS system with emphasis on quality over speed.
Cross-lingual neural codec language model for speech synthesis.
Real-time voice cloning and TTS model with 1.2B parameters by MetaVoice.
Instant voice cloning TTS by MyShell requiring only a short audio reference.
Fast TTS model that uses conditional flow matching for efficient speech synthesis.
Zero-shot speech editing and TTS using neural codec language models.
Text-to-speech system built on top of Whisper encoder representations.
Cross-platform TTS using ONNX Runtime for on-device speech synthesis.
Emotion-controllable TTS engine by NetEase with 2000+ voices.
Pure language modeling approach to TTS without traditional audio codecs.
Expressive zero-shot TTS model by Resemble AI with emotion and accent control.
TTS model that generates speech from text descriptions of the desired voice.
Zero-shot TTS model with high naturalness and speaker similarity.
Singing voice conversion model based on VITS and SoftVC for voice-to-voice transfer.
Neural TTS model by Microsoft with expressive speech synthesis.
Lightweight neural TTS model optimized for edge and mobile deployment.
Compact variant of Fish Speech optimized for faster inference.
Open-source TTS model by Resemble AI with emotion and accent control.
Open-source dialogue TTS model by Nari Labs supporting multi-speaker conversations.
Deep learning TTS library by Mozilla with Tacotron and WaveRNN implementations.
TTS model that combines style diffusion and adversarial training for human-level speech synthesis with style transfer.