The simplest way to extract text from PDFs in Python
☆428 · Jul 7, 2022 · Updated 3 years ago
Alternatives and similar repositories for slate
Users who are interested in slate are comparing it to the libraries listed below:
- A fast and friendly PDF scraping library. ☆783 · Oct 17, 2023 · Updated 2 years ago
- Python wrapper for xpdf ☆19 · Nov 28, 2019 · Updated 6 years ago
- Python PDF Parser (Not actively maintained). Check out pdfminer.six. ☆5,302 · Dec 7, 2022 · Updated 3 years ago
- extract text from any document. no muss. no fuss. ☆4,458 · Feb 4, 2026 · Updated 3 weeks ago
- A slim, non-SWIG Python adapter to CTesseract (Tesseract OCR for C). ☆24 · Apr 25, 2014 · Updated 11 years ago
- A framework to allow the matching of string entities using customised sets of transformations and matchers, plus a tool to produce the ne… ☆34 · Apr 18, 2017 · Updated 8 years ago
- Word Graph utility built with NLTK and TextBlob ☆18 · Aug 16, 2013 · Updated 12 years ago
- A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files ☆9,839 · Updated this week
- Extract tables from PDF pages. ☆299 · Jun 25, 2020 · Updated 5 years ago
- ☆18 · Jun 12, 2023 · Updated 2 years ago
- Sublime Text package for d3.js ☆32 · Aug 1, 2013 · Updated 12 years ago
- pdfrw is a pure Python library that reads and writes PDFs ☆1,911 · Apr 29, 2024 · Updated last year
- A repository of materials for a proposed class on automated story bots. ☆49 · Aug 15, 2018 · Updated 7 years ago
- Community maintained fork of pdfminer - we fathom PDF ☆6,909 · Updated this week
- Simple wrapper of tabula-java: extract table from PDF into pandas DataFrame ☆2,315 · Dec 5, 2024 · Updated last year
- A more complete example of programming with PDFMiner, which continues where the default documentation stops ☆216 · Dec 3, 2019 · Updated 6 years ago
- The ARtillery Crater Analysis and Detection Engine (ARCADE) is an experimental computer vision application built using MATLAB. ARCADE sca… ☆18 · Aug 31, 2016 · Updated 9 years ago
- This semester we will work together to gather, analyze and visualize numbers you need to understand your audience and to tell interactive… ☆17 · Oct 5, 2018 · Updated 7 years ago
- pyaddress is an address parsing library, taking the guesswork out of using addresses in your applications. We use it as part of our apart… ☆100 · Sep 16, 2019 · Updated 6 years ago
- Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community. ☆1,643 · Feb 22, 2026 · Updated last week
- A DSL to build Lucene text queries in Python. ☆38 · Jan 5, 2017 · Updated 9 years ago
- Python notebooks analyzing campaign finance and lobbying activity data from California Secretary of State’s CAL-ACCESS database ☆22 · Mar 3, 2018 · Updated 7 years ago
- UMD Course on Storytelling with Data Visualization ☆19 · Dec 6, 2015 · Updated 10 years ago
- A command-line tool to better visualize crowded dot density maps. ☆154 · Dec 27, 2014 · Updated 11 years ago
- Cultural learnings of dataviz to make benefit glorious profession of journalism. ☆21 · Sep 15, 2016 · Updated 9 years ago
- Simple program that reads .env file and use it to run given command ☆10 · Mar 5, 2023 · Updated 2 years ago
- Redis tcp map for postfix ☆12 · Jun 28, 2024 · Updated last year
- Workshop materials for scraping Twitter with Python ☆13 · May 25, 2016 · Updated 9 years ago
- Extract text, metadata and references (pdf, url, doi, arxiv) from PDF. Optionally download all referenced PDFs. ☆1,071 · Jun 15, 2023 · Updated 2 years ago
- Problem Sets for Jour72326: Scraping for Journalists. ☆20 · May 22, 2017 · Updated 8 years ago
- A rust crate for high-performance content-signing and certificate verification. ☆13 · Oct 22, 2016 · Updated 9 years ago
- Cproto generates function prototypes and variable declarations from C source code. Cproto can also convert function definitions between t… ☆10 · Jul 19, 2016 · Updated 9 years ago
- Dark Mint theme for the Plymouth bootsplash tool ☆11 · Dec 3, 2016 · Updated 9 years ago
- Javascript to present HTML footnotes as a popover. ☆45 · Oct 23, 2014 · Updated 11 years ago
- Lightweight declarative YAML and XML data binding for Python ☆19 · Mar 23, 2021 · Updated 4 years ago
- Simple Flask webservice to search through your PDF collection using Whoosh ☆11 · Jul 11, 2014 · Updated 11 years ago
- An exploratory visualization tool for the analysis of movements between geographic locations ☆13 · Dec 9, 2022 · Updated 3 years ago
- The Cairo plot for multiple countries ☆12 · Aug 23, 2019 · Updated 6 years ago
- A probabilistic CKY parser for PCFGs ☆19 · Mar 12, 2014 · Updated 11 years ago