Hugging Face Tokenizers
Ultra-fast text tokenization library in Rust with Python bindings.
About
Hugging Face Tokenizers is a tokenization library with a Rust core and bindings for Python and other languages, built for speed and versatility. It implements the Byte-Pair Encoding (BPE), WordPiece, and Unigram models, can train a custom tokenizer in a few lines of code, and supports configurable pre- and post-processing. It tokenizes about a gigabyte of text in under twenty seconds on a server CPU. Released under the Apache 2.0 license.
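To give a feel for what a Byte-Pair Encoding model does, here is a toy, standard-library-only sketch of the idea: repeatedly fuse the most frequent adjacent symbol pair, then tokenize new words by replaying those merges. This is not the library's API or its Rust implementation. The function names (`learn_bpe_merges`, `apply_merge`, `encode`), the tiny corpus, and the tie-breaking rule are all assumptions made for illustration.

```python
from collections import Counter


def apply_merge(symbols, pair):
    """Fuse each left-to-right occurrence of `pair` in a tuple of symbols."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe_merges(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy BPE)."""
    # Each distinct word starts as a tuple of characters, weighted by frequency.
    vocab = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs across the corpus.
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # every word is already a single symbol
        # Most frequent pair wins; ties are broken by tuple order so the
        # result is deterministic (an arbitrary choice for this sketch).
        best = max(pair_counts, key=lambda p: (pair_counts[p], p))
        merges.append(best)
        # Rewrite the corpus with the chosen pair fused into one symbol.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[apply_merge(symbols, best)] += freq
        vocab = new_vocab
    return merges


def encode(word, merges):
    """Tokenize one word by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = apply_merge(symbols, pair)
    return list(symbols)


# Hypothetical miniature corpus; real training uses far more text.
corpus = ["low", "low", "lower", "lowest", "new", "newer"]
merges = learn_bpe_merges(corpus, num_merges=3)
tokens = encode("lowest", merges)
print(merges, tokens)
```

A production tokenizer adds pre-tokenization, special tokens, a vocabulary-to-ID mapping, and post-processing on top of this core loop, which is the part the library lets you configure.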
Details
- Category: AI Frameworks & Libraries
- Price: Free
- Platform: Local/Desktop
- Difficulty: Easy (2/5)
- License: Apache-2.0
- Added: Apr 3, 2026
Related Tools
Tensor library for machine learning on commodity hardware
Structured output extraction from LLMs with Pydantic
Deploy LangChain runnables as REST APIs
Unified system for large-scale distributed training and inference.
High-level deep learning library making neural nets accessible with best practices.
Open-source machine learning framework by Meta with dynamic computation graphs.