Kraken
Turn-key OCR system for historical and non-Latin script documents.
About
Kraken is a turn-key OCR system optimized for historical and non-Latin script material of the kind found in academic and digitization projects. It handles binarization, layout analysis, and recognition, and ships models for scripts including Arabic, Hebrew, and Greek. It installs through pip on Linux and macOS, including ARM, and supports training custom models. It is released under the Apache 2.0 license.
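As a rough illustration of how the binarization, layout analysis, and recognition steps chain together, the sketch below assembles a call to the kraken command-line tool from Python. It assumes kraken was installed with pip and is on the PATH; the file names and the model file `my_model.mlmodel` are hypothetical placeholders, and the subcommand chain follows the basic form shown in Kraken's own documentation. Treat it as a starting point under those assumptions and check the options against your installed version.

```python
import shlex
import shutil
import subprocess
from typing import List


def build_kraken_command(image: str, output: str, model: str) -> List[str]:
    """Assemble a kraken CLI call that chains binarize, segment, and ocr.

    `model` is the path to a recognition model file; any name passed
    here is the caller's assumption, not something kraken provides.
    """
    return [
        "kraken", "-i", image, output,  # input image and output text file
        "binarize",                     # binarization
        "segment",                      # layout analysis
        "ocr", "-m", model,             # recognition with the given model
    ]


def run_ocr(image: str, output: str, model: str) -> None:
    """Run the pipeline, failing early if kraken is not installed."""
    if shutil.which("kraken") is None:
        raise RuntimeError("kraken not found on PATH; try: pip install kraken")
    subprocess.run(build_kraken_command(image, output, model), check=True)


if __name__ == "__main__":
    # Hypothetical file names; only prints the command, does not run it.
    cmd = build_kraken_command("page.tif", "page.txt", "my_model.mlmodel")
    print(shlex.join(cmd))
```

Building the argument list separately from running it makes the command easy to inspect or log before any image is processed.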
Details
- Category: OCR & Document Processing
- Price: Free
- Platform: Local/Desktop
- Difficulty: Intermediate (3/5)
- License: Apache-2.0
- Added: Apr 3, 2026
Related Tools
Python library for extracting text, tables, and metadata from PDFs.
Python bindings for MuPDF library for fast PDF text and image extraction.
Python library for extracting tables from PDF files.
Ready-to-use OCR library supporting 80+ languages with simple Python API.
One-stop tool for high-quality PDF extraction to Markdown or JSON.
Vision-language model based OCR toolkit by AI2 for document understanding.