Candle

Minimalist machine learning framework for Rust focused on performance and serverless inference.

Open Source, Self Hosted, Offline Capable

About

Candle is a minimalist machine learning framework written in Rust and maintained by Hugging Face, built to make serverless inference practical by removing the heavyweight Python dependencies of traditional frameworks. Its API is designed to feel familiar to PyTorch users while compiling to small, fast-starting binaries. The framework supports optimized CPU execution with optional MKL and Accelerate backends, CUDA with multi-GPU distribution via NCCL, and WebAssembly for running models directly in the browser. The repository ships working implementations of many models, including the LLaMA, Mistral, Phi, Gemma, and Qwen language models, Stable Diffusion for image generation, Whisper for speech recognition, and YOLO and Segment Anything for vision. Weights load from safetensors, NPZ, GGML, and PyTorch files, including llama.cpp-compatible quantized formats, and custom kernels such as FlashAttention v2 can be plugged in. Dual-licensed under MIT and Apache 2.0, Candle suits developers who want production or edge inference without a Python runtime.

Related Tools

Jan

llama.cpp

PowerInfer

vLLM

Kobold.cpp
