The simplest way to extract text from PDFs in Python
☆428 · Jul 7, 2022 · Updated 3 years ago
Alternatives and similar repositories for slate
Users that are interested in slate are comparing it to the libraries listed below.
- A fast and friendly PDF scraping library. ☆780 · Oct 17, 2023 · Updated 2 years ago
- Python wrapper for xpdf ☆19 · Nov 28, 2019 · Updated 6 years ago
- Python PDF Parser (Not actively maintained). Check out pdfminer.six. ☆5,291 · Dec 7, 2022 · Updated 3 years ago
- A semantic role labeling system for the Sumerian language. A Google Summer of Code '18 initiative. ☆16 · Feb 10, 2023 · Updated 3 years ago
- A library for extracting tables from PDF files ☆93 · Aug 2, 2020 · Updated 5 years ago
- extract text from any document. no muss. no fuss. ☆4,600 · May 7, 2026 · Updated last month
- A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files ☆10,036 · Updated this week
- pyaddress is an address parsing library, taking the guesswork out of using addresses in your applications. We use it as part of our apart… ☆100 · Sep 16, 2019 · Updated 6 years ago
- Circular buffer implementation in Nim ☆10 · Apr 21, 2023 · Updated 3 years ago
- Bitarray implementation in Nim ☆10 · Dec 14, 2020 · Updated 5 years ago
- Community maintained fork of pdfminer - we fathom PDF ☆6,986 · Mar 13, 2026 · Updated 2 months ago
- A simple, system independent infrastructure for performing web scraping. Utilizes Vagrant virtualbox interface and puppet provisioning to… ☆24 · Jul 30, 2014 · Updated 11 years ago
- pdfrw is a pure Python library that reads and writes PDFs ☆1,907 · Apr 29, 2024 · Updated 2 years ago
- A command-line tool to better visualize crowded dot density maps. ☆153 · Dec 27, 2014 · Updated 11 years ago
- Simple Bayesian spam rating in Python that is easy to use, small, contained in a single file, and doesn't require any external modules. ☆30 · Mar 11, 2015 · Updated 11 years ago
- Word Graph utility built with NLTK and TextBlob ☆18 · Aug 16, 2013 · Updated 12 years ago
- Images of Text to Text: Call Tesseract from Python and OCR a directory of pdfs ☆16 · Oct 7, 2019 · Updated 6 years ago
- A DSL to build Lucene text queries in Python. ☆38 · Jan 5, 2017 · Updated 9 years ago
- A Seattle Times investigation on Washington's "evil intent" laws ☆20 · Sep 28, 2015 · Updated 10 years ago
- A more complete example of programming with PDFMiner, which continues where the default documentation stops ☆216 · Dec 3, 2019 · Updated 6 years ago
- A collection of Fabric utilities largely for Django deployment. ☆28 · Apr 15, 2013 · Updated 13 years ago
- This semester we will work together to gather, analyze and visualize numbers you need to understand your audience and to tell interactive… ☆17 · Oct 5, 2018 · Updated 7 years ago
- Task manager built around the gevent green threads library. ☆17 · Feb 3, 2019 · Updated 7 years ago
- Slinky, a high-performance web crawler / text analytics in Python, Redis, Hadoop, R, Gephi ☆40 · Aug 30, 2010 · Updated 15 years ago
- Sublime Text package for d3.js ☆32 · Aug 1, 2013 · Updated 12 years ago
- Redis tcp map for postfix ☆12 · Jun 28, 2024 · Updated last year
- Materials for my PyData Seattle talk ☆21 · Aug 6, 2015 · Updated 10 years ago
- MobileVis - a gallery of mobile data visualizations ☆29 · Jan 23, 2023 · Updated 3 years ago
- Context manager to maintain your temporary directories/files. ☆17 · Jan 23, 2023 · Updated 3 years ago
- Workshop materials for scraping Twitter with Python