Candle

Minimalist ML framework in Rust by Hugging Face for fast inference.

Open Source, Self Hosted, Offline Capable

About

Candle is a machine learning framework written in Rust by Hugging Face. It targets workloads where binary size, startup time, and runtime performance matter more than a full research stack. The framework offers a PyTorch-like API for defining and running models and supports training as well as inference. It runs on CPU with optional MKL or Accelerate acceleration, on NVIDIA GPUs through CUDA with NCCL-based multi-GPU support, and in the browser through WebAssembly. Ready-made examples cover Llama, Mistral, Whisper, Stable Diffusion, Segment Anything, and many other language, vision, and audio models. Quantized weights in llama.cpp-compatible formats are supported alongside safetensors and PyTorch checkpoints. Because no Python interpreter sits in the loop, Candle fits serverless functions, edge deployments, and embedded services where cold-start time and memory footprint are constrained. The project is dual-licensed under MIT and Apache 2.0 and is aimed at Rust developers and ML engineers who need lean, dependency-light inference.

Related Tools

Jan

llama.cpp

PowerInfer

vLLM

Kobold.cpp
