atlanhq/camelot

Camelot: PDF Table Extraction for Humans

☆3,716

Alternatives and similar repositories for camelot

Users who are interested in camelot often compare it with the libraries listed below.

camelot-dev / excalibur
A web interface to extract tabular data from PDFs
☆1,811 · May 20, 2026 · Updated 2 months ago
camelot-dev / camelot
A Python library to extract tabular data from PDFs
☆3,790 · Updated this week
chezou / tabula-py
Simple wrapper of tabula-java: extract table from PDF into pandas DataFrame
☆2,315 · Dec 5, 2024 · Updated last year
WZBSocialScienceCenter / pdftabextract
A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.
☆2,255 · Jun 24, 2022 · Updated 4 years ago
tabulapdf / tabula
Tabula is a tool for liberating data tables trapped inside PDF files
☆7,451 · Mar 14, 2025 · Updated last year
jsvine / pdfplumber
Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
☆10,598 · Jul 20, 2026 · Updated last week
Squarespace / datasheets
Read data from, write data to, and modify the formatting of Google Sheets
☆625 · Dec 19, 2023 · Updated 2 years ago
pdfminer / pdfminer.six
Community maintained fork of pdfminer - we fathom PDF
☆7,009 · Mar 13, 2026 · Updated 4 months ago
snipsco / snips-nlu
Snips Python library to extract meaning from text
☆3,975 · May 22, 2023 · Updated 3 years ago
santinic / pampy
Pampy: The Pattern Matching for Python you always dreamed of.
☆3,527 · Jan 16, 2025 · Updated last year
explosion / spaCy
💫 Industrial-strength Natural Language Processing (NLP) in Python
☆33,774 · May 19, 2026 · Updated 2 months ago
simonw / datasette
An open source multi-tool for exploring and publishing data
☆11,310 · Updated this week
euske / pdfminer
Python PDF Parser (Not actively maintained). Check out pdfminer.six.
☆5,279 · Dec 7, 2022 · Updated 3 years ago
s0md3v / Photon
Incredibly fast crawler designed for OSINT.
☆13,063 · Feb 10, 2026 · Updated 5 months ago
google / python-fire
Python Fire is a library for automatically generating command line interfaces (CLIs) from absolutely any Python object.
☆28,214 · Jul 1, 2026 · Updated 3 weeks ago
cool-RR / PySnooper
Never use print for debugging again
☆16,579 · Jun 8, 2026 · Updated last month
thoppe / pixelhouse
A minimalist drawing library for making beautiful animations in python
☆353 · Sep 21, 2023 · Updated 2 years ago
psf / requests-html
Pythonic HTML Parsing for Humans™
☆13,824 · Apr 16, 2024 · Updated 2 years ago
plotly / dash
Data Apps & Dashboards for Python. No JavaScript Required.
☆24,351 · Updated this week
facebookresearch / pytext
A natural language modeling framework based on PyTorch
☆6,295 · Oct 17, 2022 · Updated 3 years ago
deanmalmgren / textract
extract text from any document. no muss. no fuss.
☆4,676 · Jul 11, 2026 · Updated 2 weeks ago
tabulapdf / tabula-java
Extract tables from PDF files
☆2,036 · Mar 19, 2025 · Updated last year
vaexio / vaex
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per s…
☆8,511 · Apr 1, 2026 · Updated 3 months ago
modin-project / modin
Modin: Scale your Pandas workflows by changing a single line of code
☆10,393 · Feb 10, 2026 · Updated 5 months ago
vega / altair
Declarative visualization library for Python
☆10,437 · Jul 20, 2026 · Updated last week
danburzo / percollate
A command-line tool to turn web pages into readable PDF, EPUB, HTML, or Markdown docs.
☆4,659 · Aug 29, 2025 · Updated 10 months ago
vi3k6i5 / flashtext
Extract Keywords from sentence or Replace keywords in sentences.
☆5,716 · Apr 13, 2025 · Updated last year
Erotemic / ubelt
A Python utility library with a stdlib like feel and extra batteries. Paths, Progress, Dicts, Downloads, Caching, Hashing: ubelt makes it…
☆740 · Apr 28, 2026 · Updated 2 months ago
flairNLP / flair
A very simple framework for state-of-the-art Natural Language Processing (NLP)
☆14,382 · Oct 27, 2025 · Updated 8 months ago
mahmoud / boltons
🔩 Like builtins, but boltons. 250+ constructs, recipes, and snippets which extend (and rely on nothing but) the Python standard library.…
☆6,906 · Jul 18, 2026 · Updated last week
doc-analysis / TableBank
TableBank: A Benchmark Dataset for Table Detection and Recognition
☆1,080 · Aug 12, 2024 · Updated last year
seatgeek / fuzzywuzzy
Fuzzy String Matching in Python
☆9,262 · Feb 24, 2023 · Updated 3 years ago
vibora-io / vibora
Fast, asynchronous and elegant Python web framework.
☆5,588 · Dec 23, 2020 · Updated 5 years ago
avidLearnerInProgress / pyCAIR
Content aware image resizing
☆470 · Apr 3, 2024 · Updated 2 years ago
py-pdf / pypdf
A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
☆10,130 · Updated this week
python-pendulum / pendulum
Python datetimes made easy
☆6,673 · Jul 6, 2026 · Updated 2 weeks ago
dask / dask
Parallel computing with task scheduling
☆13,871 · Updated this week
pmaupin / pdfrw
pdfrw is a pure Python library that reads and writes PDFs
☆1,911 · Apr 29, 2024 · Updated 2 years ago
psf / black
The uncompromising Python code formatter
☆41,764 · Updated this week