VibeVoice

Neural TTS model by Microsoft with expressive speech synthesis.

Open Source · Self Hosted · Offline Capable · GPU Required (6GB+ VRAM)

About

VibeVoice, from Microsoft Research, is a neural text-to-speech model aimed at expressive, natural-sounding synthesis with controllable prosody and emotion. The project later added VibeVoice-ASR, a unified speech-to-text model available through the Hugging Face Transformers library. Inference runs on a GPU, and demo notebooks are provided. The project is released under the MIT license, with model collections published on Hugging Face.
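Because inference needs a GPU and this listing gives 6 GB as the minimum VRAM, it can help to check the local card before downloading any weights. The sketch below is not part of VibeVoice. It assumes an NVIDIA GPU with the `nvidia-smi` CLI on the PATH, and the helper names (`parse_vram_mib`, `meets_minimum`, `query_vram_mib`) and the `slack_mib` tolerance are hypothetical choices made for illustration.

```python
import shutil
import subprocess


def parse_vram_mib(smi_output: str) -> list[int]:
    """Parse one memory.total value (MiB) per line, skipping blanks and '[N/A]'."""
    values = []
    for line in smi_output.splitlines():
        token = line.strip()
        if token.isdigit():
            values.append(int(token))
    return values


def meets_minimum(total_mib: int, min_gb: float = 6.0, slack_mib: int = 256) -> bool:
    """Return True if a GPU's total memory covers the listed minimum.

    Assumption: "6 GB" is treated as 6 * 1024 MiB, and slack_mib (hypothetical)
    tolerates drivers that report slightly less than the nominal size.
    """
    return total_mib + slack_mib >= min_gb * 1024


def query_vram_mib() -> list[int]:
    """Ask nvidia-smi for each GPU's total memory; return [] if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return parse_vram_mib(result.stdout)
```

Calling `query_vram_mib()` and passing each value to `meets_minimum` gives a quick yes/no per GPU. An empty list means no NVIDIA GPU was detected by this method, which is not proof that no usable accelerator exists.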

Details

Category: Text-to-Speech (TTS)
Price: Free
Platform: Local/Desktop
Difficulty: Intermediate (3/5)
License: MIT
Minimum VRAM: 6 GB
Added: Apr 3, 2026

Tags

tts, expressive, microsoft, prosody, emotional

Related Tools

Featured

Kokoro TTS

Text-to-Speech (TTS)

Lightweight and expressive TTS model with 82M parameters for fast local inference.

Open Source · Self Hosted · Offline

Easy

4.0 (1)

ChatTTS

Text-to-Speech (TTS)

Conversational TTS model optimized for dialogue and chat applications.

Open Source · Self Hosted · Offline · GPU 4GB+

Intermediate

0.0 (0)

CosyVoice

Text-to-Speech (TTS)

Multilingual large voice generation model with full-stack inference, training, and deployment.

Open Source · Self Hosted · Offline · GPU

Intermediate

0.0 (0)

CosyVoice 2

Text-to-Speech (TTS)

Large-scale multilingual TTS model by Alibaba with zero-shot voice cloning.

Open Source · Self Hosted · Offline · GPU 8GB+

Advanced

0.0 (0)

EmotiVoice

Text-to-Speech (TTS)

Emotion-controllable TTS engine by NetEase with 2000+ voices.

Open Source · Self Hosted · Offline · GPU 4GB+

Intermediate

0.0 (0)

Featured

Bark

Text-to-Speech (TTS)

Transformer-based text-to-audio model by Suno that generates speech, music, and sound effects.

Open Source · Self Hosted · Offline · GPU 4GB+

Intermediate

0.0 (0)