OCRmyPDF
Adds a searchable text layer to scanned PDFs using Tesseract OCR.
About
OCRmyPDF adds a searchable, selectable text layer to scanned PDFs using the Tesseract OCR engine, so existing PDFs become searchable without changing their appearance. It handles image preprocessing, deskewing, and file optimization, processes pages in parallel, and runs entirely on the CPU as a command-line tool. It can be installed via pip or Homebrew and is released under the MPL-2.0 license.
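As a rough illustration of the command-line workflow described above, the sketch below assembles an `ocrmypdf` invocation from Python and runs it through `subprocess`. The helper names `build_ocrmypdf_command` and `run_ocr`, the file names, and the default option values are hypothetical choices made for this example; the flags used (`--language`, `--deskew`, `--optimize`, `--jobs`) are standard OCRmyPDF options, but check `ocrmypdf --help` for the version you have installed.

```python
import shutil
import subprocess
from typing import List, Optional


def build_ocrmypdf_command(
    input_pdf: str,
    output_pdf: str,
    language: str = "eng",
    deskew: bool = True,
    optimize: int = 1,
    jobs: Optional[int] = None,
) -> List[str]:
    """Build the argument list for an ocrmypdf run (hypothetical helper)."""
    cmd = ["ocrmypdf", "--language", language]
    if deskew:
        # Straighten crooked scans before OCR.
        cmd.append("--deskew")
    # Optimization level for the output file; 0 disables optimization.
    cmd += ["--optimize", str(optimize)]
    if jobs is not None:
        # Limit how many worker processes handle pages in parallel.
        cmd += ["--jobs", str(jobs)]
    # Input and output paths come last.
    cmd += [input_pdf, output_pdf]
    return cmd


def run_ocr(input_pdf: str, output_pdf: str, **options) -> None:
    """Run ocrmypdf if it is on PATH; raise if it is missing or fails."""
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf executable not found on PATH")
    cmd = build_ocrmypdf_command(input_pdf, output_pdf, **options)
    subprocess.run(cmd, check=True)
```

Calling `run_ocr("scan.pdf", "scan_ocr.pdf", jobs=4)` would, under these assumptions, write a searchable copy of `scan.pdf` to `scan_ocr.pdf`. Keeping command construction separate from execution makes it easy to inspect or log the exact arguments before running them.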
Details
- Category: OCR & Document Processing
- Price: Free
- Platform: Local/Desktop
- Difficulty: Beginner (1/5)
- License: MPL-2.0
- Added: Apr 3, 2026
Related Tools
Python library for extracting text, tables, and metadata from PDFs.
Python bindings for MuPDF library for fast PDF text and image extraction.
Python library for extracting tables from PDF files.
Ready-to-use OCR library supporting 80+ languages with a simple Python API.
Turn-key OCR system for historical and non-Latin script documents.
OCR toolkit from AI2, based on a vision-language model, for document understanding.