Candle
Minimalist ML framework in Rust by Hugging Face for fast inference.
About
Candle by Hugging Face is a minimalist machine learning framework written in Rust with a focus on performance and ease of use, including GPU support. It targets inference workloads where a small binary and fast startup matter, and ships examples for Llama, Mistral, Whisper, Stable Diffusion, and Segment Anything across CPU and CUDA backends. Dual-licensed under MIT and Apache 2.0.
Details
- Category: LLM Inference & Serving
- Price: Free
- Platform: Local/Desktop
- Difficulty: Advanced (4/5)
- License: MIT and Apache 2.0 (dual-licensed)
- Added: Apr 3, 2026
Related Tools
Port of Meta's LLaMA model in C/C++ for efficient CPU inference.
High-throughput LLM serving engine with PagedAttention.
Optimized inference library for running quantized LLMs on consumer GPUs.
Open-source ChatGPT alternative that runs 100% offline on your computer.
Hugging Face's high-performance text generation server.
Fast LLM inference on consumer GPUs using neuron-aware sparse computation.