MuseV
Infinite-length, high-fidelity virtual-human video generation with visual conditioned parallel denoising.
About
MuseV, from Tencent Music Entertainment's Lyra Lab, is a diffusion-based framework for generating long, effectively unbounded virtual-human videos through visual conditioned parallel denoising. It takes images, text, and reference video as conditioning inputs. It is part of the Muse open-source series alongside MuseTalk and MusePose and targets high-fidelity talking and performing avatars. Inference benefits from a GPU with 12 GB or more of VRAM. The model is distributed as open-source research.
Details
- Category: Video Generation
- Price: Free
- Platform: Local/Desktop
- Difficulty: Advanced (4/5)
- Minimum VRAM: 12 GB
- Added: Apr 3, 2026
Related Tools
Text-to-video generation framework with cascaded latent diffusion.
Turn text-to-image models into animation generators.
AI animation tool for creating dynamic video content.
Latest Open-Sora release with improved video generation quality.
AI video generation model from Stability AI
Updated CogVideo model by Zhipu AI with improved video quality.