Latte

Latent diffusion transformer for video generation with spatial-temporal attention.

Open Source, Self Hosted, Offline Capable, GPU Required (16GB+ VRAM)

About

Latte is a latent diffusion transformer for video generation. It factorizes spatial and temporal modeling into dedicated attention blocks, reaching competitive quality with efficient training. The repository provides PyTorch model definitions, pretrained weights, and the training, sampling, and evaluation code from the paper. It was developed by researchers at Monash University and Shanghai AI Lab and is released under the Apache 2.0 license.
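
To give a rough sense of why factorizing attention helps, the sketch below counts query-key pairs for joint space-time attention versus separate spatial and temporal passes. It is an illustration only, not code from the Latte repository: the function names and the example sizes (16 frames, 256 latent tokens per frame) are assumptions chosen for the arithmetic.

```python
# Illustrative only: helper names and sizes are hypothetical, not Latte's API.

def spatial_groups(frames, tokens_per_frame):
    """Tokens that attend to each other in a spatial block: same frame."""
    return [[(t, n) for n in range(tokens_per_frame)] for t in range(frames)]


def temporal_groups(frames, tokens_per_frame):
    """Tokens that attend to each other in a temporal block: same location."""
    return [[(t, n) for t in range(frames)] for n in range(tokens_per_frame)]


def attention_pairs(frames, tokens_per_frame):
    """Count query-key pairs for joint vs. factorized attention."""
    total = frames * tokens_per_frame
    # Joint attention: every token attends to every token in the clip.
    joint = total * total
    # Factorized: sum the squared size of each attention group.
    spatial = sum(len(g) ** 2 for g in spatial_groups(frames, tokens_per_frame))
    temporal = sum(len(g) ** 2 for g in temporal_groups(frames, tokens_per_frame))
    return {
        "joint": joint,
        "spatial": spatial,
        "temporal": temporal,
        "factorized": spatial + temporal,
    }


if __name__ == "__main__":
    counts = attention_pairs(frames=16, tokens_per_frame=256)
    print(counts)
    print("joint / factorized:", round(counts["joint"] / counts["factorized"], 1))
```

Under these assumed sizes, one spatial pass plus one temporal pass touches roughly 15 times fewer query-key pairs than a single joint pass. Actual memory and speed depend on the model configuration and attention implementation.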

Details

Price: Free
Platform: Local/Desktop
Difficulty: Advanced (4/5)
License: Apache-2.0
Minimum VRAM: 16 GB
Added: Apr 3, 2026

Related Tools

Text-to-video generation framework with cascaded latent diffusion.

Open Source, Self Hosted, Offline, GPU 16GB+, Advanced

Turn text-to-image models into animation generators

Open Source, Self Hosted, Offline, GPU 12GB+, Advanced

AI animation tool for creating dynamic video content

Open Source, Self Hosted, Offline, GPU 8GB+, Advanced

Latest Open-Sora release with improved video generation quality.

Open Source, Self Hosted, Offline, GPU 16GB+, Advanced

AI video generation model from Stability AI

Open Source, Self Hosted, Offline, GPU 16GB+, Advanced

Updated CogVideo model by Zhipu AI with improved video quality.

Open Source, Self Hosted, Offline, GPU 16GB+, Advanced