
FlashAttention

IO-aware exact attention algorithm that is 2-4x faster than standard attention and uses linear rather than quadratic memory in sequence length.

Open Source · Self Hosted · Offline Capable · GPU Required (8GB+ VRAM)

About

FlashAttention, by Tri Dao, is an IO-aware exact attention algorithm that reduces attention memory usage from O(N^2) to O(N) in sequence length N while running 2-4x faster than standard attention. It is a critical optimization for both training and inference of large transformer models. Released under the BSD-3-Clause license.
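
A minimal usage sketch, assuming the flash-attn Python package and its flash_attn_func entry point (the module path, tensor layout, and dtype requirements below follow the package's documented interface and are not stated on this page); it requires a CUDA GPU and fp16/bf16 inputs:

import torch
from flash_attn import flash_attn_func  # assumed: pip install flash-attn

batch, seqlen, nheads, headdim = 2, 1024, 8, 64

# Q, K, V in (batch, seqlen, nheads, headdim) layout, half precision, on GPU.
q = torch.randn(batch, seqlen, nheads, headdim, dtype=torch.float16, device="cuda")
k = torch.randn_like(q)
v = torch.randn_like(q)

# Exact (not approximate) attention computed tile-by-tile in on-chip SRAM,
# so the full seqlen x seqlen score matrix is never materialized in GPU DRAM;
# this is what reduces memory from O(N^2) to O(N).
out = flash_attn_func(q, k, v, causal=True)  # -> (batch, seqlen, nheads, headdim)
print(out.shape)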


Details

Price: Free
Platform: Local/Desktop
Difficulty: Expert (5/5)
License: BSD-3-Clause
Minimum VRAM: 8 GB
Added: Apr 3, 2026

Similar Tools

PyTorch

Open-source machine learning framework by Meta with dynamic computation graphs.

Open Source · Self Hosted · Offline · Intermediate
TensorFlow

End-to-end open-source ML platform by Google for training and deployment.

Open Source · Self Hosted · Offline · Intermediate

JAX

High-performance numerical computing library by Google with auto-differentiation.

Open Source · Self Hosted · Offline · Advanced