cbrunet / python-poppler
Python binding to Poppler-cpp pdf library
★111 · Updated last year
Alternatives and similar repositories for python-poppler
Users who are interested in python-poppler compare it to the libraries listed below:
- Python API for PDF documents ★124 · Updated last year
- A Python implementation of Lunr.js ★199 · Updated 6 months ago
- A Python tool to help extracting information from structured PDFs. ★412 · Updated last month
- The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity ★119 · Updated 6 months ago
- Truly universal encoding detector in pure Python ★700 · Updated 3 weeks ago
- Read SVG files and convert them to other formats. ★345 · Updated this week
- python library to simplify working with jsonlines and ndjson data ★299 · Updated last year
- Pure-Python full-text search library ★639 · Updated last year
- Pandoc (Python Library) ★165 · Updated last year
- Pure python implementation of identifying files based off their magic numbers ★212 · Updated 2 months ago
- A utility to read and write PDFs with Python ★337 · Updated 3 years ago
- A modern CSS selector implementation for BeautifulSoup ★247 · Updated 2 weeks ago
- Whoosh is a fast, featureful full-text indexing and searching library implemented in pure Python. ★220 · Updated last week
- A Python implementation of the JSON5 data format ★254 · Updated last month
- mirror of https://hg.reportlab.com/hg-public/reportlab ★74 · Updated 3 weeks ago
- A fast, comprehensive, ISO 639 library. ★43 · Updated last month
- Python interface to Apache PDFBox command-line tools. ★77 · Updated 2 years ago
- A Python library to sanitize/validate a string such as filenames/file-paths/etc. ★270 · Updated 3 months ago
- ★535 · Updated last week
- rstr is a helper module for easily generating random strings of various types. It could be useful for fuzz testing, generating dummy data… ★94 · Updated 6 months ago
- XPath 1.0/2.0/3.0/3.1 parsers and selectors for ElementTree and lxml ★85 · Updated 3 weeks ago
- Fast Autocomplete: When Elasticsearch suggestions are not fast and flexible enough ★286 · Updated last week
- Parse numbers written in natural language ★123 · Updated 10 months ago
- Append/Concatenate .docx documents ★120 · Updated last year
- Extract docx headers, footers, (formatted) text, footnotes, endnotes, properties, and images. ★189 · Updated last week
- Python Powerful Timeout Decorator that can be used safely on classes, methods, class methods ★160 · Updated 2 months ago
- Complete lxml external type annotation ★67 · Updated last week
- Simplify DOCX files to JSON ★250 · Updated 11 months ago
- universal character encoding detector ★60 · Updated last year
- Powerful polling utility in Python ★61 · Updated 2 months ago