PowerInfer
Fast LLM inference on consumer GPUs using neuron-aware sparse computation.
Open Source · Self Hosted · Offline Capable · GPU Required (4GB+ VRAM)
About
PowerInfer, developed at SJTU, speeds up LLM inference on consumer GPUs by exploiting the locality of neuron activations: a small set of "hot" neurons fires for most inputs and is kept in GPU VRAM, while rarely activated "cold" neurons are computed on the CPU. The authors report up to 11x faster generation than llama.cpp for large models on limited VRAM. MIT licensed.
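The hot/cold split can be pictured with a minimal sketch. This is illustrative NumPy, not PowerInfer's actual code: the dimensions, the `hot_budget` parameter, and the random "activation frequencies" (which the real system profiles offline and supplements with online predictors) are all assumptions.

```python
import numpy as np

# Illustrative sketch of hot/cold neuron splitting (NOT PowerInfer's API).
# Assumption: per-neuron activation frequencies were profiled offline;
# here they are drawn from a skewed random distribution instead.

rng = np.random.default_rng(0)
d_in, d_ff = 64, 256                    # hypothetical FFN dimensions
W = rng.standard_normal((d_ff, d_in)).astype(np.float32)
freq = rng.power(0.3, d_ff)             # skewed "activation frequency" per neuron

hot_budget = 64                          # neurons that fit the simulated VRAM budget
hot_idx = np.argsort(freq)[-hot_budget:]            # most frequently active neurons
cold_idx = np.setdiff1d(np.arange(d_ff), hot_idx)   # everything else

W_hot, W_cold = W[hot_idx], W[cold_idx]  # would live on GPU / CPU respectively

def ffn_up(x: np.ndarray) -> np.ndarray:
    """Up-projection with hot rows on the 'GPU' path and cold rows on the 'CPU' path."""
    out = np.empty(d_ff, dtype=np.float32)
    out[hot_idx] = W_hot @ x             # fast path: VRAM-resident weights
    out[cold_idx] = W_cold @ x           # slow path: offloaded weights
    return np.maximum(out, 0)            # ReLU, whose zeros create the sparsity

y = ffn_up(rng.standard_normal(d_in).astype(np.float32))
```

In the real system the cold-path work can additionally be skipped for neurons a small predictor expects to output zero, which is where most of the speedup comes from.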
Details
- Category: LLM Inference & Serving
- Price: Free
- Platform: Local/Desktop
- Difficulty: Advanced (4/5)
- License: MIT
- Minimum VRAM: 4 GB
- Added: Apr 3, 2026
Similar Tools
Desktop application for discovering, downloading, and running local LLMs.
Self Hosted · Offline
Beginner
Open-source ChatGPT alternative that runs 100% offline on your computer.
Open Source · Self Hosted · Offline
Beginner
Open-source ecosystem for running LLMs locally on consumer hardware.
Open Source · Self Hosted · Offline
Beginner