Tabula

Tool for extracting tables from PDF files into CSV, TSV, or JSON.

Open Source, Self Hosted, Offline Capable

About

Tabula is a tool for extracting data tables trapped inside PDF files. It is available as the tabula-java library, a command-line tool, and a web GUI. It exports tables to CSV, TSV, or JSON and is widely used by journalists and data analysts. The project is volunteer-run: the end-user application sees infrequent updates, while tabula-java still receives occasional bug-fix releases. It is released under the MIT license.
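
As a rough illustration of the command-line route, the sketch below builds a tabula-java invocation from Python and runs it as a subprocess. It assumes Java is installed and that a tabula-java "jar with dependencies" has been downloaded; the jar file name `tabula.jar`, the helper names, and the example file names are all hypothetical. The `--format`, `--pages`, `--outfile`, and `--lattice` options are tabula-java command-line options, but check `--help` on your version before relying on them.

```python
import shlex
import subprocess
from pathlib import Path

# Output formats named in the description above.
ALLOWED_FORMATS = {"CSV", "TSV", "JSON"}


def build_tabula_command(pdf_path, out_path, fmt="CSV", pages="all",
                         lattice=False, jar_path="tabula.jar"):
    """Return the argument list for a tabula-java run (nothing is executed).

    jar_path is an assumed location for the downloaded tabula-java jar.
    """
    fmt = fmt.upper()
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    cmd = [
        "java", "-jar", str(jar_path),
        "--format", fmt,
        "--pages", str(pages),
        "--outfile", str(out_path),
    ]
    if lattice:
        # Lattice mode is meant for tables with ruling lines between cells.
        cmd.append("--lattice")
    cmd.append(str(pdf_path))
    return cmd


def extract_tables(pdf_path, out_path, **options):
    """Run tabula-java and return the output path; raises if the run fails."""
    cmd = build_tabula_command(pdf_path, out_path, **options)
    subprocess.run(cmd, check=True)
    return Path(out_path)


if __name__ == "__main__":
    # Print the command instead of running it, so this works without Java.
    print(shlex.join(build_tabula_command("report.pdf", "report.csv")))
```

Keeping the command construction separate from the subprocess call makes it possible to inspect or log the exact invocation before anything is executed.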

Details

Category: OCR & Document Processing
Price: Free
Platform: Local/Desktop
Difficulty: Beginner (1/5)
License: MIT
Added: Apr 3, 2026

Tags

document pdf table-extraction csv java

Related Tools

Docling

OCR & Document Processing

Document parsing library by IBM for converting PDFs and documents to structured data.

Open Source, Self Hosted, Offline

Easy

DocTR

OCR & Document Processing

Deep-learning-based OCR library for Python, built on TensorFlow or PyTorch.

Open Source, Self Hosted, Offline

Easy

MinerU

OCR & Document Processing

One-stop tool for high-quality PDF extraction to Markdown or JSON.

Open Source, Self Hosted, Offline

Easy

PyMuPDF

OCR & Document Processing

Python bindings for the MuPDF library, for fast PDF text and image extraction.

Open Source, Self Hosted, Offline

Beginner

EasyOCR

OCR & Document Processing

Ready-to-use OCR library supporting 80+ languages with a simple Python API.

Open Source, Self Hosted, Offline

Beginner

Camelot

OCR & Document Processing

Python library for extracting tables from PDF files.

Open Source, Self Hosted, Offline

Beginner