Llamafile
Single-file executable LLMs by Mozilla that run on any OS without installation.
About
Llamafile by Mozilla packages a language model together with llama.cpp into a single executable that runs on Windows, macOS, Linux, and other systems without installation, by combining llama.cpp with Cosmopolitan Libc. You download one file and run it to get a local chat server and API. The goal is to make open models easy to distribute and run. Released under the Apache 2.0 license.
Reviews (0)
Leave a Review
No reviews yet. Be the first to review!
Details
- Category
- LLM Inference & Serving
- Price
- Free
- Platform
- Local/Desktop
- Difficulty
- Beginner (1/5)
- License
- Apache-2.0
- Added
- Apr 3, 2026
Related Tools
Port of Meta's LLaMA model in C/C++ for efficient CPU inference
High-throughput LLM serving engine with PagedAttention
Minimalist ML framework in Rust by Hugging Face for fast inference.
Optimized inference library for running quantized LLMs on consumer GPUs.
Open-source ChatGPT alternative that runs 100% offline on your computer.
Fast LLM inference on consumer GPUs using neuron-aware sparse computation.