Text-to-Speech (TTS) AI Tools
Open-source text-to-speech engines and models for generating natural-sounding speech from text input.
Lightweight and expressive TTS model with 82M parameters for fast local inference.
Transformer-based text-to-audio model by Suno that generates speech, music, and sound effects.
Few-shot voice cloning and TTS combining GPT and SoVITS architectures.
Singing voice conversion model based on VITS and SoftVC for voice-to-voice transfer.
Zero-shot TTS model with high naturalness and speaker similarity.
High-quality multilingual TTS library by MyShell with fast CPU inference.
Deep learning TTS library by Mozilla with Tacotron and WaveRNN implementations.
TTS model using style diffusion and adversarial training for human-level synthesis with style transfer.
Multilingual TTS with zero-shot voice cloning and streaming support.
Large-scale multilingual TTS model by Alibaba with zero-shot voice cloning.
Conversational TTS model optimized for dialogue and chat applications.
Privacy-focused neural TTS engine by Mycroft AI for offline voice assistants.
Compact open-source speech synthesizer supporting 100+ languages.
University of Edinburgh speech synthesis system with decades of research behind it.
Multi-voice TTS system with emphasis on quality over speed.
Cross-lingual neural codec language model for speech synthesis.
Real-time voice cloning and TTS model with 1.2B parameters by MetaVoice.
Fast TTS with conditional flow matching for efficient speech synthesis.
Zero-shot speech editing and TTS using neural codec language models.
Text-to-speech system built on top of Whisper encoder representations.
Cross-platform TTS using ONNX Runtime for on-device speech synthesis.
Emotion-controllable TTS engine by NetEase with 2000+ voices.
Pure language modeling approach to TTS without traditional audio codecs.
TTS model that generates speech from text descriptions of the desired voice.
Neural TTS model by Microsoft with expressive speech synthesis.
Lightweight neural TTS model optimized for edge and mobile deployment.
Compact variant of Fish Speech optimized for faster inference.
Open-source dialogue TTS model by Nari Labs supporting multi-speaker conversations.
Diffusion transformer text-to-speech model using flow matching for fluent, faithful speech.
Multilingual large voice generation model with full-stack inference, training, and deployment.
Coqui deep learning toolkit for text-to-speech with pretrained models in 1100+ languages.
Instant voice cloning TTS by MyShell requiring only a short audio reference.
Speech-text foundation model for full-duplex, real-time spoken dialogue, built on a neural audio codec.
Open-source TTS model by Resemble AI with emotion and accent control.
Expressive zero-shot TTS model by Resemble AI with emotion and accent control.