maxpmaxp / pdfreader
Python API for PDF documents
☆125 · Updated last year
Alternatives and similar repositories for pdfreader
Users who are interested in pdfreader are comparing it to the libraries listed below.
- Python binding to Poppler-cpp pdf library ☆114 · Updated last year
- A Python tool to help extracting information from structured PDFs. ☆425 · Updated last week
- python library to simplify working with jsonlines and ndjson data ☆304 · Updated last year
- A Python implementation of Lunr.js 🌖 ☆201 · Updated 9 months ago
- Extract docx headers, footers, (formatted) text, footnotes, endnotes, properties, and images. ☆195 · Updated last week
- A utility to read and write PDFs with Python ☆338 · Updated 4 years ago
- Pandoc (Python Library) ☆174 · Updated 2 months ago
- Python library for fast approximate string matching using Jaro and Jaro-Winkler similarity ☆76 · Updated last year
- Python interface to Apache PDFBox command-line tools. ☆78 · Updated 2 years ago
- Python3 bindings for the Compact Language Detector v3 (CLD3) ☆155 · Updated 2 years ago
- Pythonic search engine based on PyLucene. ☆131 · Updated last week
- A fast, comprehensive, ISO 639 library. ☆47 · Updated 3 months ago
- Find parts of long text or data, allowing for some changes/typos. ☆333 · Updated 3 weeks ago
- Simplify DOCX files to JSON ☆257 · Updated last year
- Parse numbers written in natural language ☆123 · Updated last year
- Efficient string matching with regular expressions ☆146 · Updated this week
- An open-source package for python to clean raw text data ☆72 · Updated 2 years ago
- Python Simple Object Storage - provides a list and dictionary interface that seamlessly stores data in a file, like a simplified database… ☆58 · Updated 2 years ago
- A modern CSS selector implementation for BeautifulSoup ☆252 · Updated 3 months ago
- Append/Concatenate .docx documents ☆123 · Updated last year
- Python library that reads JSON files of any size. ☆196 · Updated 2 years ago
- rstr is a helper module for easily generating random strings of various types. It could be useful for fuzz testing, generating dummy data… ☆98 · Updated 9 months ago
- A flexible utility for flattening and unflattening dict-like objects in Python. ☆186 · Updated 3 years ago
- Extract dates from text ☆66 · Updated 4 years ago
- The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity ☆121 · Updated last month
- mirror of https://hg.reportlab.com/hg-public/reportlab ☆76 · Updated 3 weeks ago
- 🦉 Modern high-performance serialization utilities for Python (JSON, MessagePack, Pickle) ☆478 · Updated 2 weeks ago
- CyDifflib is a fast implementation of difflib's algorithms, which can be used as a drop-in replacement. ☆30 · Updated 7 months ago
- XPath 1.0/2.0/3.0/3.1 parsers and selectors for ElementTree and lxml ☆86 · Updated last month
- URL normalization for Python ☆99 · Updated 7 months ago