TabbyAPI

Fast ExLlamaV2-based OpenAI-compatible API server for quantized models.

Open SourceSelf HostedOffline CapableGPU Required (6GB+ VRAM)
0.0 (0)

About

TabbyAPI is an OpenAI-compatible API server built on ExLlamaV2 for serving EXL2 and GPTQ quantized models. Fast inference on consumer GPUs. Supports streaming, function calling, and multi-user. AGPL-3.0 license.

Reviews (0)

Leave a Review

No reviews yet. Be the first to review!

Details

Price
Free
Platform
Local/Desktop
Difficulty
Easy (2/5)
License
AGPL-3.0
Minimum VRAM
6 GB
Added
Apr 3, 2026

Similar Tools

Featured

Desktop application for discovering, downloading, and running local LLMs.

Self HostedOffline
Beginner
0.0 (0)

Open-source ChatGPT alternative that runs 100% offline on your computer.

Open SourceSelf HostedOffline
Beginner
0.0 (0)

Open-source ecosystem for running LLMs locally on consumer hardware.

Open SourceSelf HostedOffline
Beginner
0.0 (0)