maxpmaxp / pdfreader
Python API for PDF documents
☆124 · Updated last year
Alternatives and similar repositories for pdfreader
Users interested in pdfreader are comparing it to the libraries listed below.
- A Python tool to help extracting information from structured PDFs. ☆412 · Updated this week
- Python binding to Poppler-cpp pdf library ☆111 · Updated last year
- python library to simplify working with jsonlines and ndjson data ☆299 · Updated last year
- Python interface to Apache PDFBox command-line tools. ☆77 · Updated 2 years ago
- A Python implementation of Lunr.js 🌖 ☆199 · Updated 6 months ago
- A utility to read and write PDFs with Python ☆337 · Updated 3 years ago
- Extract dates from text ☆65 · Updated 4 years ago
- Extract docx headers, footers, (formatted) text, footnotes, endnotes, properties, and images. ☆191 · Updated last week
- Python library for fast approximate string matching using Jaro and Jaro-Winkler similarity ☆74 · Updated last year
- Whoosh is a fast, featureful full-text indexing and searching library implemented in pure Python. ☆220 · Updated last week
- mirror of https://hg.reportlab.com/hg-public/reportlab ☆74 · Updated this week
- Simple, Pythonic extraction of text, shapes and images from PDFs ☆80 · Updated 5 years ago
- Pandoc (Python Library) ☆165 · Updated last year
- Parse numbers written in natural language ☆123 · Updated 10 months ago
- Pythonic search engine based on PyLucene. ☆130 · Updated 3 weeks ago
- Simplify DOCX files to JSON ☆250 · Updated 11 months ago
- A Domain Specific Language (DSL) for building language patterns. These can be later compiled into spaCy patterns, pure regex, or any othe… ☆68 · Updated 2 years ago
- Atom, RSS and JSON feed parser for Python 3 ☆117 · Updated 2 years ago
- Python package for Google's diff-match-patch native C++ implementation. ☆82 · Updated last year
- URL normalization for Python ☆98 · Updated 4 months ago
- A general purpose PDF text-layer redaction tool for Python 2/3. ☆204 · Updated last year
- Efficient string matching with regular expressions ☆144 · Updated last week
- Easy rate-limiting for python requests ☆107 · Updated last week
- The most basic Text::Unidecode port (licensed under Artistic License or GPL or GPLv2+ - choose whatever you want) ☆67 · Updated 2 years ago
- Next-generation Punkt sentence boundary detection with zero dependencies ☆17 · Updated last month
- A fast, comprehensive, ISO 639 library. ☆43 · Updated last month
- Python3 bindings for the Compact Language Detector v3 (CLD3) ☆154 · Updated 2 years ago
- Common interface for data container classes ☆68 · Updated 2 weeks ago
- Python library that reads JSON files of any size. ☆198 · Updated 2 years ago
- A tiny library for Python text normalisation. Useful for ad-hoc text processing. ☆154 · Updated this week