PowerInfer

Fast LLM inference on consumer GPUs using neuron-aware sparse computation.

Open Source · Self Hosted · Offline Capable · GPU Required (4 GB+ VRAM)

About

PowerInfer, from the IPADS group at Shanghai Jiao Tong University (SJTU), speeds up LLM inference on consumer GPUs by exploiting the locality of neuron activations: a small set of frequently activated ("hot") neurons is kept on the GPU, while the remaining ("cold") neurons are computed on the CPU. The authors report up to 11x faster inference than llama.cpp for large models on limited VRAM. Released under the MIT license.
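At its core, the placement decision is a budgeted ranking problem: profile how often each FFN neuron activates, then fill the available VRAM with the hottest ones. The sketch below illustrates that idea in Python; it is a hypothetical illustration, not PowerInfer's actual code, and the function names, model shape, and per-neuron size estimate are all assumptions.

```python
# Hypothetical sketch of PowerInfer-style hot/cold neuron placement.
# Not PowerInfer's real API; all names and size estimates are assumptions.
import numpy as np

def split_hot_cold(activation_freq, vram_budget_bytes, bytes_per_neuron):
    """Rank neurons by profiled activation frequency and keep the hottest
    ones on the GPU until the VRAM budget is spent; the rest run on CPU."""
    order = np.argsort(activation_freq)[::-1]           # hottest first
    n_gpu = min(order.size, vram_budget_bytes // bytes_per_neuron)
    return order[:n_gpu], order[n_gpu:]                 # (hot ids, cold ids)

# Example: a Llama-7B-like model (32 layers x 11008 FFN neurons), assuming
# each neuron owns one gate, up, and down row of 4096 fp16 weights (~24 KB).
freq = np.random.default_rng(0).random(32 * 11008)      # stand-in for profiler output
hot, cold = split_hot_cold(freq, 4 * 1024**3, 3 * 4096 * 2)
print(f"{hot.size} neurons on GPU, {cold.size} on CPU")
```

Note that this static split only decides where weights live; at inference time PowerInfer additionally uses small online predictors to skip neurons that are unlikely to activate for the current input.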

Details

Price: Free
Platform: Local/Desktop
Difficulty: Advanced (4/5)
License: MIT
Minimum VRAM: 4 GB
Added: Apr 3, 2026

Similar Tools

Featured

Desktop application for discovering, downloading, and running local LLMs.

Self Hosted · Offline
Beginner

Open-source ChatGPT alternative that runs 100% offline on your computer.

Open Source · Self Hosted · Offline
Beginner

Open-source ecosystem for running LLMs locally on consumer hardware.

Open Source · Self Hosted · Offline
Beginner