Candle

Minimalist ML framework in Rust by Hugging Face for fast inference.

Open Source, Self Hosted, Offline Capable

About

Candle is a minimalist machine learning framework from Hugging Face, written in Rust and focused on performance (including GPU support) and ease of use. It targets inference workloads where a small binary and fast startup matter, and it ships examples for Llama, Mistral, Whisper, Stable Diffusion, and Segment Anything that run on CPU and CUDA backends. It is dual-licensed under the MIT and Apache 2.0 licenses.

Details

Price: Free
Platform: Local/Desktop
Difficulty: Advanced (4/5)
License: MIT and Apache 2.0 (dual-licensed)
Added: Apr 3, 2026

Related Tools

Featured

Port of Meta's LLaMA model in C/C++ for efficient CPU inference

Open Source, Self Hosted, Offline
Intermediate

Featured

High-throughput LLM serving engine with PagedAttention

Open Source, Self Hosted, Offline, GPU 16GB+
Intermediate

Optimized inference library for running quantized LLMs on consumer GPUs.

Open Source, Self Hosted, Offline, GPU 6GB+
Intermediate

Open-source ChatGPT alternative that runs 100% offline on your computer.

Open Source, Self Hosted, Offline
Beginner

Hugging Face's high-performance text generation server

Open Source, Self Hosted, Offline, GPU 16GB+
Advanced

Fast LLM inference on consumer GPUs using neuron-aware sparse computation.

Open Source, Self Hosted, Offline, GPU 4GB+
Advanced