Featured Tool

FlashAttention

IO-aware exact attention algorithm that runs 2-4x faster than standard attention and uses less memory.

Open Source | Self Hosted | Offline Capable | GPU Required (8GB+ VRAM)

About

FlashAttention, by Tri Dao, is an IO-aware exact attention algorithm. It reduces the memory used by attention from O(N^2) to O(N) in the sequence length N and runs 2-4x faster than standard attention. It is a critical optimization for training and inference of large transformer models. It is released under the BSD-3-Clause license.
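
To illustrate how exact attention can avoid storing the full N x N score matrix, here is a minimal pure-Python sketch of the tiling idea with a running (online) softmax. It is not FlashAttention's actual implementation, which is a fused GPU kernel that also tiles the queries and keeps blocks in fast on-chip memory. The names `dot`, `standard_attention`, and `tiled_attention` and the `block_size` parameter are illustrative assumptions, not part of any library.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))


def standard_attention(q, k, v):
    """Reference attention: builds the full N x N score matrix first."""
    scale = 1.0 / math.sqrt(len(q[0]))
    scores = [[scale * dot(qi, kj) for kj in k] for qi in q]  # O(N^2) memory
    out = []
    for row in scores:
        row_max = max(row)  # subtract the max for numerical stability
        weights = [math.exp(s - row_max) for s in row]
        total = sum(weights)
        out.append([sum(w * vj[c] for w, vj in zip(weights, v)) / total
                    for c in range(len(v[0]))])
    return out


def tiled_attention(q, k, v, block_size=2):
    """Same result, but only `block_size` scores exist at any one time.

    Illustrative sketch: K and V are visited block by block while a running
    max, a running softmax denominator, and an unnormalized output
    accumulator are kept per query row.
    """
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for qi in q:
        running_max = float("-inf")
        running_sum = 0.0
        acc = [0.0] * len(v[0])
        for start in range(0, len(k), block_size):
            k_blk = k[start:start + block_size]
            v_blk = v[start:start + block_size]
            scores = [scale * dot(qi, kj) for kj in k_blk]
            new_max = max(running_max, max(scores))
            # Rescale what was accumulated under the old max.
            correction = math.exp(running_max - new_max)
            weights = [math.exp(s - new_max) for s in scores]
            running_sum = running_sum * correction + sum(weights)
            acc = [a * correction + sum(w * vj[c] for w, vj in zip(weights, v_blk))
                   for c, a in enumerate(acc)]
            running_max = new_max
        out.append([a / running_sum for a in acc])
    return out
```

Calling `tiled_attention(q, k, v, block_size=b)` on small list-of-lists inputs should agree with `standard_attention(q, k, v)` up to floating-point rounding for any positive `b`, which is the sense in which the tiled computation is exact rather than approximate.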

Details

Price: Free
Platform: Local/Desktop
Difficulty: Expert (5/5)
License: BSD-3-Clause
Minimum VRAM: 8 GB
Added: Apr 3, 2026

Related Tools

Tensor library for machine learning on commodity hardware

Open Source | Self Hosted | Offline
Difficulty: Expert

Structured output extraction from LLMs with Pydantic

Open Source | Self Hosted
Difficulty: Easy

Deploy LangChain runnables as REST APIs

Open Source | Self Hosted
Difficulty: Easy

Unified system for large-scale distributed training and inference.

Open Source | Self Hosted | Offline | GPU 8GB+
Difficulty: Advanced

High-level deep learning library making neural nets accessible with best practices.

Open Source | Self Hosted | Offline | GPU 4GB+
Difficulty: Easy
Featured

Open-source machine learning framework by Meta with dynamic computation graphs.

Open Source | Self Hosted | Offline
Difficulty: Intermediate