plangrid / pdf-annotate
Pure-python library for adding annotations to PDFs
☆202 Updated 4 years ago
Alternatives and similar repositories for pdf-annotate
Users who are interested in pdf-annotate are comparing it to the libraries listed below.
- Python API for PDF documents ☆122 Updated 9 months ago
- Python interface to Apache PDFBox command-line tools. ☆75 Updated 2 years ago
- Demos, examples and utilities using PyMuPDF ☆664 Updated 11 months ago
- A general purpose PDF text-layer redaction tool for Python 2/3. ☆196 Updated last year
- Python binding to libpoppler with focus on text extraction ☆97 Updated 3 years ago
- Python binding to Poppler-cpp pdf library ☆110 Updated 9 months ago
- A Python tool to help extracting information from structured PDFs. ☆404 Updated 3 weeks ago
- Collection of OCR-related python tools and wrappers from @OCR-D ☆128 Updated this week
- Software that makes labeling PDFs easy. ☆415 Updated last year
- Library used to deskew a scanned document ☆470 Updated this week
- PDF to XML ALTO file converter ☆242 Updated 2 weeks ago
- A pure python based utility to extract text and images from docx files. ☆547 Updated 3 months ago
- fault-tolerant Python3 package for searching, navigating, and modifying LaTeX documents ☆307 Updated 4 months ago
- Convert a PDF via OCR to a TXT file in UTF-8 encoding ☆153 Updated last year
- A utility to read and write PDFs with Python ☆334 Updated 3 years ago
- Python client for GROBID Web services ☆339 Updated last week
- Simplify DOCX files to JSON ☆240 Updated 8 months ago
- PDF.js + Hypothesis viewer / annotator ☆391 Updated 4 months ago
- A post-processing tool for scanned sheets of paper. ☆1,083 Updated 11 months ago
- Extracts and formats text annotations from a PDF file ☆593 Updated 5 months ago
- A set of tools to allow PDF to XML conversion, utilising Apache Beam and other tools. The aim of this project is to bring multiple tools… ☆294 Updated 3 years ago
- Tutorial on how to deskew (straighten) text images ☆51 Updated 3 years ago
- a utility to extract the title from a PDF file ☆140 Updated 4 months ago
- Linguistic Annotation and Visualization Tool for PDF Documents ☆199 Updated 5 years ago
- gcv2hocr converts from Google Cloud Vision OCR output to hocr to make a searchable pdf. ☆106 Updated 4 years ago
- Extract docx headers, footers, (formatted) text, footnotes, endnotes, properties, and images. ☆183 Updated last week
- A semi-automatic open-source tool for Layout Analysis and Region EXtraction on early printed books. ☆187 Updated 3 weeks ago
- A more complete example of programming with PDFMiner, which continues where the default documentation stops ☆214 Updated 5 years ago
- Wrapper for pdftohtml that tries to extract paragraph structure ☆50 Updated 6 years ago
- A project about benchmarking and evaluating existing PDF extraction tools on their semantic abilities to extract the body texts from PDF … ☆68 Updated 4 years ago