ashutoshvarma / pyxpdf
Fast and memory-efficient Python PDF Parser based on xpdf sources
☆44Updated 2 years ago
Alternatives and similar repositories for pyxpdf
Users that are interested in pyxpdf are comparing it to the libraries listed below
- Python API for PDF documents☆124Updated last year
- Python binding to Poppler-cpp pdf library☆113Updated last year
- A Python tool to help extracting information from structured PDFs.☆427Updated 2 weeks ago
- Python library for fast approximate string matching using Jaro and Jaro-Winkler similarity☆77Updated 2 years ago
- The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity☆123Updated 3 months ago
- A Python binding of SQLite Full Text Search Tokenizer☆50Updated 2 months ago
- Parse numbers written in natural language☆126Updated last year
- Fastest general-purpose parsing library for Python with a familiar API☆49Updated 7 months ago
- Whoosh is a fast, featureful full-text indexing and searching library implemented in pure Python.☆227Updated last month
- Extract docx headers, footers, (formatted) text, footnotes, endnotes, properties, and images.☆201Updated this week
- PDFViewer is a GUI tool, written using python3 and tkinter, which lets you view PDF documents.☆82Updated 4 years ago
- CyDifflib is a fast implementation of difflib's algorithms, which can be used as a drop-in replacement.☆32Updated 9 months ago
- Cython-based high-performance alternative to the Python (re) module for doing basic pattern matching on large data sets.☆11Updated 3 years ago
- Spokestack is a library that allows a user to easily incorporate a voice interface into any Python application with a focus on embedded s…☆142Updated 4 years ago
- A simple python wrapper for PDFium.☆17Updated 4 years ago
- A utility to read and write PDFs with Python☆338Updated 4 years ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency☆184Updated 8 months ago
- Find parts of long text or data, allowing for some changes/typos.☆338Updated 2 months ago
- A general purpose PDF text-layer redaction tool for Python 2/3.☆208Updated last year
- Diff Match Patch is a high-performance library in multiple languages that manipulates plain text.☆57Updated last year
- A Python implementation of Lunr.js 🌖☆204Updated 10 months ago
- A fast RLock implementation for CPython☆35Updated last year
- 🚀 Extremely fast fuzzy matcher & spelling checker in Python!☆30Updated 4 years ago
- Stripping rtf to plain old text☆112Updated 8 months ago
- Python Powerful Timeout Decorator that can be used safely on classes, methods, class methods☆164Updated 2 months ago
- Pandoc (Python Library)☆177Updated 4 months ago
- Python library of 60+ commonly-used validator functions☆131Updated 3 years ago
- Python package for Google's diff-match-patch native C++ implementation.☆87Updated last year
- Fast Base64 encoding/decoding in Python☆170Updated last week
- Safely evaluate AST nodes without side effects☆50Updated 3 weeks ago