Latte
Latent diffusion transformer for video generation with spatial-temporal attention.
About
Latte is a latent diffusion transformer for video generation. It factorizes spatial and temporal modeling into dedicated attention blocks, with the aim of reaching competitive generation quality while keeping training efficient. The repository provides PyTorch model definitions, pretrained weights, and the training, sampling, and evaluation code from the paper. Latte was developed by Monash University and Shanghai AI Lab and is released under the Apache 2.0 license.
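To make the factorization idea concrete, here is a minimal sketch in plain Python. It is not Latte's code. It only counts how many query-key pairs attention must score when tokens are grouped per frame (spatial block) and per spatial position (temporal block), compared with joint attention over every video token. The function names and the clip size (16 frames, 256 latent tokens per frame) are illustrative assumptions, not values taken from the repository.

```python
# Illustrative sketch only: hypothetical helper names, not Latte's API.

def spatial_groups(num_frames, tokens_per_frame):
    """One group per frame: tokens attend only within their own frame."""
    return [[(t, s) for s in range(tokens_per_frame)] for t in range(num_frames)]


def temporal_groups(num_frames, tokens_per_frame):
    """One group per spatial position: a token attends across frames only."""
    return [[(t, s) for t in range(num_frames)] for s in range(tokens_per_frame)]


def pair_count(groups):
    """Query-key pairs scored when attention is restricted to each group."""
    return sum(len(group) ** 2 for group in groups)


def attention_pair_counts(num_frames, tokens_per_frame):
    """Return (joint, factorized) pair counts for one attention pass."""
    total_tokens = num_frames * tokens_per_frame
    joint = total_tokens ** 2  # every token attends to every other token
    factorized = (
        pair_count(spatial_groups(num_frames, tokens_per_frame))
        + pair_count(temporal_groups(num_frames, tokens_per_frame))
    )
    return joint, factorized


# Assumed clip size for illustration: 16 frames, 256 latent tokens per frame.
joint, factorized = attention_pair_counts(16, 256)
print(f"joint: {joint}, factorized: {factorized}")
```

Under these assumed sizes, one spatial block plus one temporal block scores far fewer pairs than a single joint block, which is one common motivation for factorized designs. Actual memory and speed depend on the implementation and hardware.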
Details
- Category: Video Generation
- Price: Free
- Platform: Local/Desktop
- Difficulty: Advanced (4/5)
- License: Apache-2.0
- Minimum VRAM: 16 GB
- Added: Apr 3, 2026
Related Tools
Text-to-video generation framework with cascaded latent diffusion.
Turn text-to-image models into animation generators.
AI animation tool for creating dynamic video content.
Latest Open-Sora release with improved video generation quality.
AI video generation model from Stability AI.
Updated CogVideo model by Zhipu AI with improved video quality.