SGLang
Fast serving framework for LLMs with structured generation and RadixAttention.
Open Source, Self Hosted, Offline Capable, GPU Required (8 GB+ VRAM)
About
SGLang is a fast serving framework for large language models from LMSYS. It features RadixAttention for efficient KV cache reuse across requests that share a prefix, constrained decoding (JSON, regex), and multimodal support. Its throughput is competitive with vLLM. It is released under the Apache 2.0 license.
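To make the KV cache reuse idea concrete, here is a toy sketch of prefix matching over token sequences. It is not SGLang's implementation: the class and method names (`ToyPrefixCache`, `match_prefix`, `insert`) are hypothetical, it uses a plain per-token trie rather than a compressed radix tree, and it only counts tokens instead of storing KV tensors. It is meant only to show why two requests that share a system prompt need less computation on the second request.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One trie node per token; children are keyed by token id."""
    children: dict = field(default_factory=dict)


class ToyPrefixCache:
    """Hypothetical stand-in for a prefix cache: tracks which token prefixes are cached."""

    def __init__(self):
        self.root = Node()

    def match_prefix(self, tokens):
        """Return how many leading tokens of `tokens` are already cached."""
        node, matched = self.root, 0
        for tok in tokens:
            child = node.children.get(tok)
            if child is None:
                break
            node, matched = child, matched + 1
        return matched

    def insert(self, tokens):
        """Mark `tokens` as cached; return how many tokens had to be newly computed."""
        node, newly_computed = self.root, 0
        for tok in tokens:
            if tok not in node.children:
                node.children[tok] = Node()
                newly_computed += 1
            node = node.children[tok]
        return newly_computed


cache = ToyPrefixCache()
system_prompt = [1, 2, 3, 4]               # token ids shared by both requests
first = system_prompt + [10, 11]
second = system_prompt + [20, 21, 22]

computed_first = cache.insert(first)       # nothing cached yet, so every token is new
reused = cache.match_prefix(second)        # the shared system prompt is found in the cache
computed_second = cache.insert(second)     # only the differing suffix is new
print(computed_first, reused, computed_second)
```

In this sketch the second request reuses the four system-prompt tokens and computes only its three-token suffix. A production cache would also need to hold the actual key/value tensors and decide what to evict when memory runs out, which this toy omits.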
Details
- Category: LLM Inference & Serving
- Price: Free
- Platform: Local/Desktop
- Difficulty: Intermediate (3/5)
- License: Apache-2.0
- Minimum VRAM: 8 GB
- Added: Apr 3, 2026
Similar Tools
Featured
Desktop application for discovering, downloading, and running local LLMs.
Self Hosted, Offline
Beginner
Open-source ChatGPT alternative that runs 100% offline on your computer.
Open Source, Self Hosted, Offline
Beginner
Open-source ecosystem for running LLMs locally on consumer hardware.
Open Source, Self Hosted, Offline
Beginner