JoshData / pdf-redactor
A general purpose PDF text-layer redaction tool for Python 2/3.
☆190 · Updated 8 months ago
Alternatives and similar repositories for pdf-redactor:
Users who are interested in pdf-redactor are comparing it to the libraries listed below.
- A utility to read and write PDFs with Python ☆334 · Updated 3 years ago
- Python module to drive the awesome pdftk binary. ☆148 · Updated last year
- Python interface to Apache PDFBox command-line tools. ☆75 · Updated 2 years ago
- Pure-python library for adding annotations to PDFs ☆199 · Updated 3 years ago
- Simple PDF text extraction ☆898 · Updated this week
- Python binding to Poppler-cpp pdf library ☆105 · Updated 5 months ago
- Simple, Pythonic extraction of text, shapes and images from PDFs ☆79 · Updated 4 years ago
- A fast and friendly PDF scraping library. ☆773 · Updated last year
- Python API for PDF documents ☆118 · Updated 5 months ago
- A tool for converting PDF into hOCR with text, tables, and figures being recognized and preserved. ☆435 · Updated last year
- CSS related utilities (parsing, serialization, etc) for python ☆32 · Updated 4 months ago
- python app/framework for 'all things ISBN' including metadata, descriptions, covers... ☆224 · Updated last year
- A utility to read and write PDFs with Python ☆72 · Updated 7 months ago
- A more complete example of programming with PDFMiner, which continues where the default documentation stops ☆214 · Updated 5 years ago
- PDF to XML ALTO file converter ☆223 · Updated last month
- A Python library for extracting titles, images, descriptions and canonical urls from HTML. ☆147 · Updated 4 years ago
- Convert a PDF via OCR to a TXT file in UTF-8 encoding ☆144 · Updated last year
- ezodf is a Python package to create new or open existing OpenDocument (ODF) files to extract, add, modify or delete document data, forked… ☆66 · Updated 2 years ago
- The most basic Text::Unidecode port (licensed under Artistic License or GPL or GPLv2+ - choose whatever you want) ☆65 · Updated last year
- mirror of https://hg.reportlab.com/hg-public/reportlab ☆71 · Updated this week
- small collection of python scripts for pdf manipulation ☆95 · Updated 10 months ago
- Hy-phen-ation made easy ☆207 · Updated 3 weeks ago
- Python binding to libpoppler with focus on text extraction ☆97 · Updated 3 years ago
- Convert html to docx ☆76 · Updated 7 months ago
- Python address detector and parser ☆206 · Updated last year
- a utility to extract the title from a PDF file ☆138 · Updated 2 months ago
- A python based HTML to text conversion library, command line client and Web service. ☆285 · Updated last month
- A pure python based utility to extract text and images from docx files. ☆531 · Updated last year
- gcv2hocr converts from Google Cloud Vision OCR output to hocr to make a searchable pdf. ☆105 · Updated 4 years ago
- A library for extracting tables from PDF files ☆89 · Updated 4 years ago