Tools/LLM Inference & Serving/LocalAI

LocalAI

Drop-in OpenAI-compatible API server for running LLMs, image, and audio models locally.

Open SourceSelf HostedOffline Capable

0.0 (0)

Visit Website View on GitHub

About

LocalAI is a self-hosted inference server that exposes an OpenAI-compatible API for running language, vision, voice, image, and video models on local hardware, including CPU-only setups. It loads several model formats such as GGUF and provides drop-in replacements for chat, embeddings, image, and audio endpoints. It ships as a Docker image and is created by Ettore Di Giacinto. Released under the MIT license.

Reviews (0)

Leave a Review

No reviews yet. Be the first to review!

Details

Category: LLM Inference & Serving
Price: Free
Platform: Local/Desktop
Difficulty: Easy (2/5)
License: MIT
Added: Apr 3, 2026

Tags

inference openai-compatible api self-hosted docker multi-modal

Related Tools

Featured

llama.cpp

LLM Inference & Serving

Port of Meta's LLaMA model in C/C++ for efficient CPU inference

Open SourceSelf HostedOffline

Intermediate

0.0 (0)

Featured

vLLM

LLM Inference & Serving

High-throughput LLM serving engine with PagedAttention

Open SourceSelf HostedOfflineGPU 16GB+

Intermediate

0.0 (0)

Candle

LLM Inference & Serving

Minimalist ML framework in Rust by Hugging Face for fast inference.

Open SourceSelf HostedOffline

Advanced

0.0 (0)

ExLlamaV2

LLM Inference & Serving

Optimized inference library for running quantized LLMs on consumer GPUs.

Open SourceSelf HostedOfflineGPU 6GB+

Intermediate

0.0 (0)

Jan

LLM Inference & Serving

Open-source ChatGPT alternative that runs 100% offline on your computer.

Open SourceSelf HostedOffline

Beginner

0.0 (0)

PowerInfer

LLM Inference & Serving

Fast LLM inference on consumer GPUs using neuron-aware sparse computation.

Open SourceSelf HostedOfflineGPU 4GB+

Advanced

0.0 (0)

Browse all LLM Inference & Serving tools