maxpmaxp / pdfreader
Python API for PDF documents
☆121 · Updated 8 months ago
Alternatives and similar repositories for pdfreader
Users who are interested in pdfreader are comparing it to the libraries listed below.
- Python binding to Poppler-cpp pdf library ☆110 · Updated 8 months ago
- A Python tool to help extracting information from structured PDFs. ☆403 · Updated last month
- Python interface to Apache PDFBox command-line tools. ☆75 · Updated 2 years ago
- Append/Concatenate .docx documents ☆111 · Updated 9 months ago
- Simplify DOCX files to JSON ☆235 · Updated 7 months ago
- mirror of https://hg.reportlab.com/hg-public/reportlab ☆72 · Updated last week
- Pandoc (Python Library) ☆156 · Updated 8 months ago
- A utility to read and write PDFs with Python ☆335 · Updated 3 years ago
- rstr is a helper module for easily generating random strings of various types. It could be useful for fuzz testing, generating dummy data… ☆93 · Updated 2 months ago
- A Python implementation of Lunr.js 🌖 ☆195 · Updated 2 months ago
- Extract docx headers, footers, (formatted) text, footnotes, endnotes, properties, and images. ☆179 · Updated this week
- Demos, examples and utilities using PyMuPDF ☆659 · Updated 10 months ago
- Convert html to docx ☆78 · Updated 10 months ago
- A fast, comprehensive, ISO 639 library. ☆38 · Updated 2 months ago
- An open-source package for python to clean raw text data ☆69 · Updated last year
- Blazing fast fuzzy text search for Python. ☆43 · Updated 3 weeks ago
- Simple, Pythonic extraction of text, shapes and images from PDFs ☆79 · Updated 4 years ago
- Parse numbers written in natural language ☆114 · Updated 6 months ago
- Pure-python library for adding annotations to PDFs ☆202 · Updated 4 years ago
- A python library for extracting text from PDFs without losing the formatting of the PDF content. ☆77 · Updated 3 years ago
- python library to simplify working with jsonlines and ndjson data ☆292 · Updated 9 months ago
- Python library for fast approximate string matching using Jaro and Jaro-Winkler similarity ☆71 · Updated last year
- Next-generation Punkt sentence boundary detection with zero dependencies ☆17 · Updated last month
- 🖍️ Highlight text in documents ☆107 · Updated 3 weeks ago
- Read SVG files and convert them to other formats. ☆341 · Updated 4 months ago
- A simple python wrapper for PDFium. ☆17 · Updated 3 years ago
- A low-level PDF creator ☆126 · Updated 6 months ago
- Efficient Trie-based regex unions for blacklist/whitelist filtering and one-pass mapping-based string replacing ☆72 · Updated 2 weeks ago
- Stripping rtf to plain old text ☆101 · Updated last month
- A modern CSS selector implementation for BeautifulSoup ☆236 · Updated last week