Tabula
Tool for extracting tables from PDF files into CSV, TSV, or JSON (or into pandas DataFrames via the separate tabula-py wrapper).
About
Tabula is a tool for extracting data tables trapped inside PDF files, available as the tabula-java library, a command-line tool, and a web GUI. It exports tables to CSV, TSV, or JSON and is widely used by journalists and data analysts. The project is volunteer-run; the end-user application sees infrequent updates while tabula-java still receives occasional bug-fix releases. It is released under the MIT license.
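As a rough illustration of the command-line route, the sketch below assembles an argument list for the tabula-java jar using its `--pages`, `--format`, `--outfile`, and `--lattice` options. The helper name `build_tabula_command` and the file names `tabula.jar`, `report.pdf`, and `report.csv` are hypothetical placeholders; the real jar file name depends on the release you download. The sketch only builds and prints the command, so it runs without Java installed.

```python
import shlex
from typing import List

# Export formats named in the description above.
SUPPORTED_FORMATS = {"CSV", "TSV", "JSON"}


def build_tabula_command(
    jar_path: str,
    pdf_path: str,
    out_path: str,
    fmt: str = "CSV",
    pages: str = "all",
    lattice: bool = False,
) -> List[str]:
    """Build an argument list for the tabula-java command-line tool.

    jar_path is a placeholder: point it at whichever tabula-java jar
    you have downloaded.
    """
    fmt = fmt.upper()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")

    cmd = [
        "java", "-jar", jar_path,
        "--pages", pages,      # e.g. "all", "1-3", or "2,5"
        "--format", fmt,       # CSV, TSV, or JSON
        "--outfile", out_path,
    ]
    if lattice:
        # Lattice mode suits tables drawn with ruling lines between cells.
        cmd.append("--lattice")
    cmd.append(pdf_path)
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, shown only to illustrate the call.
    command = build_tabula_command("tabula.jar", "report.pdf", "report.csv")
    print(shlex.join(command))
```

To actually run the extraction, the returned list could be passed to `subprocess.run(command, check=True)`, assuming Java and the jar are available locally.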
Details
- Category: OCR & Document Processing
- Price: Free
- Platform: Local/Desktop
- Difficulty: Beginner (1/5)
- License: MIT
- Added: Apr 3, 2026
Related Tools
Python library for extracting text, tables, and metadata from PDFs.
Python bindings for MuPDF library for fast PDF text and image extraction.
Python library for extracting tables from PDF files.
Ready-to-use OCR library supporting 80+ languages with simple Python API.
Turn-key OCR system for historical and non-Latin script documents.
Vision-language model based OCR toolkit by AI2 for document understanding.