Latte

Latent diffusion transformer for video generation with spatial-temporal attention.

Open Source, Self Hosted, Offline Capable, GPU Required (16GB+ VRAM)

About

Latte is a latent diffusion transformer for video generation. It factorizes spatial and temporal modeling into dedicated attention blocks, a design intended to keep training efficient while reaching quality competitive with other video generation models. The repository provides PyTorch model definitions, pretrained weights, and the training, sampling, and evaluation code from the paper. Latte was developed by researchers at Monash University and Shanghai AI Lab and is released under the Apache 2.0 license.
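To make the idea of dedicated spatial and temporal attention blocks more concrete, the sketch below counts query-key pairs for one joint attention block versus one spatial block plus one temporal block. It is a rough illustration, not code from the Latte repository: the clip shape (16 frames, a 32x32 latent grid, patch size 2) and the helper names `attention_pairs` and `compare_costs` are assumptions chosen for the example.

```python
# Illustrative only: pair counts for joint vs. factorized attention.
# The clip shape and function names are assumptions for this sketch and
# are not taken from the Latte codebase.

def attention_pairs(num_sequences: int, seq_len: int) -> int:
    """Query-key pairs when attention runs independently over
    `num_sequences` sequences of `seq_len` tokens each."""
    return num_sequences * seq_len * seq_len


def compare_costs(frames: int, tokens_per_frame: int) -> dict:
    # Joint attention: every token attends to every token in the clip.
    joint = attention_pairs(1, frames * tokens_per_frame)
    # Spatial block: tokens attend only to tokens in the same frame.
    spatial = attention_pairs(frames, tokens_per_frame)
    # Temporal block: each spatial location attends across all frames.
    temporal = attention_pairs(tokens_per_frame, frames)
    return {
        "joint": joint,
        "spatial": spatial,
        "temporal": temporal,
        "factorized": spatial + temporal,
    }


frames = 16                          # assumed clip length
tokens_per_frame = (32 // 2) ** 2    # assumed 32x32 latent, patch size 2
costs = compare_costs(frames, tokens_per_frame)

for name, pairs in costs.items():
    print(f"{name:>10}: {pairs:,} query-key pairs")
```

Under these assumed shapes the spatial block dominates the factorized cost, and the spatial plus temporal pair touches far fewer query-key pairs than a single joint block. The count ignores projections, MLP layers, and the number of blocks stacked, so it should be read as a rough intuition rather than a measure of end-to-end training cost.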

Details

Category: Video Generation
Price: Free
Platform: Local/Desktop
Difficulty: Advanced (4/5)
License: Apache-2.0
Minimum VRAM: 16 GB
Added: Apr 3, 2026

Tags

video-generation diffusion transformer spatial-temporal research

Related Tools

Featured

HunyuanVideo

Video Generation

Open-source video generation model by Tencent with text and image conditioning.

Open Source, Self Hosted, Offline, GPU 24GB+

Advanced

I2VGen-XL

Video Generation

Image-to-video generation model by Alibaba DAMO Academy.

Open Source, Self Hosted, Offline, GPU 12GB+

Advanced

CogVideo 1.5

Video Generation

Updated CogVideo model by Zhipu AI with improved video quality.

Open Source, Self Hosted, Offline, GPU 16GB+

Advanced

MuseV

Video Generation

Infinite-length virtual human video generation with visual conditioned parallel denoising.

Open Source, Self Hosted, Offline, GPU 12GB+

Advanced

LaVie

Video Generation

Text-to-video generation framework with cascaded latent diffusion.

Open Source, Self Hosted, Offline, GPU 16GB+

Advanced

CogVideoX

Video Generation

Open-source text-to-video model by Zhipu AI/Tsinghua with 2B and 5B variants.

Open Source, Self Hosted, Offline, GPU 12GB+

Advanced