PyMuPDF
Python bindings for the MuPDF library, for fast PDF text and image extraction.
About
PyMuPDF provides Python bindings for the MuPDF engine. It extracts text, images, and metadata from PDFs, renders pages to images, and converts and manipulates documents. It is built for high performance and memory efficiency and is widely used in AI data pipelines; a companion helper is aimed at producing LLM-ready Markdown. It is released under the AGPL-3.0 license, with a commercial license available.
Details
- Category: OCR & Document Processing
- Price: Free
- Platform: Local/Desktop
- Difficulty: Beginner (1/5)
- License: AGPL-3.0
- Added: Apr 3, 2026
Related Tools
Python library for extracting text, tables, and metadata from PDFs.
Python library for extracting tables from PDF files.
Ready-to-use OCR library supporting 80+ languages with simple Python API.
Turn-key OCR system for historical and non-Latin script documents.
One-stop tool for high-quality PDF extraction to Markdown or JSON.
Vision-language model based OCR toolkit by AI2 for document understanding.