maxpmaxp / pdfreader
Python API for PDF documents
☆124 Updated last year
Alternatives and similar repositories for pdfreader
Users who are interested in pdfreader are comparing it to the libraries listed below.
- A Python tool to help extracting information from structured PDFs. ☆427 Updated 2 weeks ago
- A Python implementation of Lunr.js 🌖 ☆204 Updated 10 months ago
- Python binding to Poppler-cpp pdf library ☆113 Updated last year
- Python library for fast approximate string matching using Jaro and Jaro-Winkler similarity ☆77 Updated 2 years ago
- A utility to read and write PDFs with Python ☆338 Updated 4 years ago
- Python interface to Apache PDFBox command-line tools. ☆79 Updated 3 years ago
- python library to simplify working with jsonlines and ndjson data ☆307 Updated last year
- Extract docx headers, footers, (formatted) text, footnotes, endnotes, properties, and images. ☆201 Updated this week
- Pythonic search engine based on PyLucene. ☆132 Updated last month
- Python3 bindings for the Compact Language Detector v3 (CLD3) ☆155 Updated 2 years ago
- Simplify DOCX files to JSON ☆256 Updated last year
- The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity ☆123 Updated 3 months ago
- Python library that reads JSON files of any size. ☆197 Updated 2 years ago
- Pandoc (Python Library) ☆177 Updated 4 months ago
- Python package for Google's diff-match-patch native C++ implementation. ☆87 Updated last year
- Simple python wrapper to convert HTML to PDF with headless Chrome via selenium ☆73 Updated last month
- Find parts of long text or data, allowing for some changes/typos. ☆338 Updated 2 months ago
- mirror of https://hg.reportlab.com/hg-public/reportlab ☆78 Updated 2 weeks ago
- Pure-python library for adding annotations to PDFs ☆212 Updated 4 years ago
- Efficient string matching with regular expressions ☆146 Updated this week
- An open-source package for python to clean raw text data ☆74 Updated 2 years ago
- Library for unit extraction - fork of quantulum for python3 ☆145 Updated last year
- A purely-functional HTML builder for Python. Think JSX rather than templates. ☆102 Updated last year
- Whoosh is a fast, featureful full-text indexing and searching library implemented in pure Python. ☆227 Updated last month
- 🦉 Modern high-performance serialization utilities for Python (JSON, MessagePack, Pickle) ☆482 Updated 2 months ago
- Efficient Trie-based regex unions for blacklist/whitelist filtering and one-pass mapping-based string replacing ☆76 Updated 2 weeks ago
- A modern CSS selector implementation for BeautifulSoup ☆263 Updated this week
- Extract datetimes and durations from natural language text as Python objects. Supports ranges, lists, and more. ☆121 Updated 6 months ago
- Accurately find/replace/remove emojis in text strings ☆163 Updated 2 years ago
- Parse numbers written in natural language ☆126 Updated last year