ModelScope Text-to-Video

Text-to-video generation model by Alibaba DAMO Academy on ModelScope.

Open Source, Self Hosted, Offline Capable, GPU Required (12GB+ VRAM)

About

ModelScope Text-to-Video is an early open text-to-video diffusion model from Alibaba DAMO Academy, distributed through the ModelScope model-as-a-service library. It generates short clips from English text prompts and was one of the first openly released models of its kind. The core ModelScope library provides a unified interface for loading and running the model. Inference needs a GPU; this listing puts the minimum at 12 GB of VRAM.
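As a rough illustration of that unified interface, here is a minimal Python sketch. It assumes the `modelscope` package is installed, that the task name is `text-to-video-synthesis`, and that the model ID is `damo/text-to-video-synthesis`. Neither identifier appears in this listing, so check both against the model card before relying on them. The helper names `build_request` and `generate_clip` are hypothetical and exist only for this example.

```python
def build_request(prompt: str) -> dict:
    """Wrap an English prompt in the dict form the pipeline is assumed to accept."""
    text = prompt.strip()
    if not text:
        raise ValueError("prompt must be a non-empty English sentence")
    return {"text": text}


def generate_clip(prompt: str, model_id: str = "damo/text-to-video-synthesis") -> str:
    """Run the model once and return the path of the generated video file.

    The task name and default model ID are assumptions, not taken from this page.
    """
    # Imported lazily so build_request works without modelscope installed.
    from modelscope.outputs import OutputKeys
    from modelscope.pipelines import pipeline

    pipe = pipeline("text-to-video-synthesis", model_id)
    result = pipe(build_request(prompt))
    return result[OutputKeys.OUTPUT_VIDEO]
```

Calling `generate_clip("A panda eating bamboo on a rock.")` would download the model weights on first use and needs a GPU that meets the VRAM requirement, so expect the first run to take a while.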

Details

Category: Video Generation
Price: Free
Platform: Local/Desktop
Difficulty: Advanced (4/5)
Minimum VRAM: 12 GB
Added: Apr 3, 2026

Tags

video-generation text-to-video alibaba modelscope

Related Tools

Featured

HunyuanVideo

Video Generation

Open-source video generation model by Tencent with text and image conditioning.

Open Source, Self Hosted, Offline, GPU 24GB+

Advanced

I2VGen-XL

Video Generation

Image-to-video generation model by Alibaba DAMO Academy.

Open Source, Self Hosted, Offline, GPU 12GB+

Advanced

CogVideo 1.5

Video Generation

Updated CogVideo model by Zhipu AI with improved video quality.

Open Source, Self Hosted, Offline, GPU 16GB+

Advanced

MuseV

Video Generation

Infinite-length music-driven video generation with visual conditioning.

Open Source, Self Hosted, Offline, GPU 12GB+

Advanced

LaVie

Video Generation

Text-to-video generation framework with cascaded latent diffusion.

Open Source, Self Hosted, Offline, GPU 16GB+

Advanced

CogVideoX

Video Generation

Open-source text-to-video model by Zhipu AI/Tsinghua with 2B and 5B variants.

Open Source, Self Hosted, Offline, GPU 12GB+

Advanced
