CosyVoice (Zero-Shot Cloning)

Zero-shot voice cloning mode of Alibaba's CosyVoice model.

Open Source, Self Hosted, Offline Capable, GPU Required (8GB+ VRAM)

About

Fun-CosyVoice 3.0 is the latest open text-to-speech model from Alibaba's FunAudio team, built on a large language model backbone and aimed at zero-shot voice cloning in the wild. It accepts a short reference clip and generates speech that follows the target voice, language, and prosody. The repository ships multiple checkpoints (Fun-CosyVoice3, CosyVoice2, CosyVoice-300M variants), a Hugging Face demo, and vLLM-accelerated serving.
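Since the model works from a short reference clip, it can help to sanity-check that clip before handing it to any cloning pipeline. The sketch below is not part of CosyVoice and does not call its API; it uses only Python's standard `wave` module. The helper names (`clip_info`, `is_usable_reference`, `make_test_tone`) and the 3 to 30 second bounds are assumptions chosen for illustration, not values taken from the project's documentation.

```python
import io
import math
import struct
import wave

# Hypothetical bounds for a "short" reference clip (illustrative only,
# not taken from the CosyVoice documentation).
MIN_SECONDS = 3.0
MAX_SECONDS = 30.0


def clip_info(wav_file):
    """Return (sample_rate, channels, duration_seconds) for a PCM WAV file.

    `wav_file` may be a path or a seekable binary file object.
    """
    with wave.open(wav_file, "rb") as wf:
        rate = wf.getframerate()
        channels = wf.getnchannels()
        duration = wf.getnframes() / float(rate)
    return rate, channels, duration


def is_usable_reference(wav_file, min_s=MIN_SECONDS, max_s=MAX_SECONDS):
    """True if the clip is mono and its length lies within [min_s, max_s]."""
    _, channels, duration = clip_info(wav_file)
    return channels == 1 and min_s <= duration <= max_s


def make_test_tone(seconds, rate=16000, freq=220.0):
    """Build an in-memory mono 16-bit sine tone, used only for the demo."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wf.setframerate(rate)
        n = int(seconds * rate)
        frames = b"".join(
            struct.pack("<h", int(12000 * math.sin(2 * math.pi * freq * i / rate)))
            for i in range(n)
        )
        wf.writeframes(frames)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(clip_info(make_test_tone(5.0)))
    print(is_usable_reference(make_test_tone(5.0)))
    print(is_usable_reference(make_test_tone(1.0)))
```

In practice `clip_info` would be pointed at a real recording path instead of the synthetic tone; the bounds should be adjusted to whatever the chosen checkpoint's documentation recommends.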

Details

Price: Free
Platform: Local/Desktop
Difficulty: Intermediate (3/5)
License: Apache-2.0
Minimum VRAM: 8 GB
Added: Apr 3, 2026

Related Tools

Community fork of RVC with additional features and optimizations.

Open Source, Self Hosted, Offline, GPU 4GB+
Easy

Updated zero-shot voice cloning by MyShell with improved quality.

Open Source, Self Hosted, Offline, GPU 4GB+
Easy
Featured

Voice cloning mode of XTTS-v2 for creating custom voice replicas.

Open Source, Self Hosted, Offline, GPU 4GB+
Easy

Zero-shot voice cloning capabilities of Fish Speech model.

Open Source, Self Hosted, Offline, GPU 6GB+
Easy
Featured

Easy-to-use retrieval-based voice conversion framework for real-time voice cloning.

Open Source, Self Hosted, Offline, GPU 4GB+
Easy

Text-free one-shot voice conversion model that requires no transcription.

Open Source, Self Hosted, Offline, GPU 4GB+
Intermediate