LocalAI
Drop-in OpenAI-compatible API server for running LLMs, image, and audio models locally.
About
LocalAI is a self-hosted inference server that exposes an OpenAI-compatible API for running language, vision, voice, image, and video models on local hardware, including CPU-only setups. It loads several model formats such as GGUF and provides drop-in replacements for chat, embeddings, image, and audio endpoints. It ships as a Docker image and is created by Ettore Di Giacinto. Released under the MIT license.
Reviews (0)
Leave a Review
No reviews yet. Be the first to review!
Details
- Category
- LLM Inference & Serving
- Price
- Free
- Platform
- Local/Desktop
- Difficulty
- Easy (2/5)
- License
- MIT
- Added
- Apr 3, 2026
Related Tools
Port of Meta's LLaMA model in C/C++ for efficient CPU inference
High-throughput LLM serving engine with PagedAttention
Minimalist ML framework in Rust by Hugging Face for fast inference.
Optimized inference library for running quantized LLMs on consumer GPUs.
Open-source ChatGPT alternative that runs 100% offline on your computer.
Fast LLM inference on consumer GPUs using neuron-aware sparse computation.