pdfplumber
Python library for extracting text, tables, and metadata from PDFs.
About
pdfplumber is a Python library, built on pdfminer.six, for extracting text, tables, and metadata from machine-generated PDFs. It exposes position data at the character, rectangle, and line level, includes table extraction and visual debugging tools, and works without OCR on text-based files. It suits structured data extraction from reports and forms. It is released under the MIT license.
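As a rough illustration of the features described above, the sketch below reads the first page of a PDF and collects its metadata, text, object counts, and first table. The helper names `table_to_records` and `summarize_first_page` are hypothetical and not part of pdfplumber. The sketch also assumes the first row of the extracted table is a header row, which will not hold for every document.

```python
def table_to_records(table):
    """Convert a table (list of rows) into a list of dicts.

    Assumption: the first row holds the column headers.
    """
    if not table:
        return []
    header, *rows = table
    return [dict(zip(header, row)) for row in rows]


def summarize_first_page(path):
    """Collect text, layout objects, and the first table from page 1."""
    # Imported here so table_to_records can be used without pdfplumber.
    import pdfplumber  # third-party: pip install pdfplumber

    with pdfplumber.open(path) as pdf:
        page = pdf.pages[0]
        return {
            "metadata": pdf.metadata,            # document info dictionary
            "text": page.extract_text(),         # plain text of the page
            # Each char is a dict with keys such as "text", "x0", "top".
            "first_char": page.chars[0] if page.chars else None,
            "n_rects": len(page.rects),          # rectangle objects
            "n_lines": len(page.lines),          # line objects
            # extract_table() returns a list of rows, or None if no table.
            "records": table_to_records(page.extract_table()),
        }
```

Calling `summarize_first_page("report.pdf")`, where `report.pdf` is a placeholder path, returns a dictionary of these values. For visual debugging, `page.to_image().debug_tablefinder()` draws the detected table lines and cells onto a rendering of the page.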
Details
- Category: OCR & Document Processing
- Price: Free
- Platform: Local/Desktop
- Difficulty: Beginner (1/5)
- License: MIT
- Added: Apr 3, 2026
Related Tools
Python bindings for MuPDF library for fast PDF text and image extraction.
Python library for extracting tables from PDF files.
Ready-to-use OCR library supporting 80+ languages with simple Python API.
Turn-key OCR system for historical and non-Latin script documents.
One-stop tool for high-quality PDF extraction to Markdown or JSON.
Vision-language model based OCR toolkit by AI2 for document understanding.