py-pdf / sample-files
Files which can be used to test PDF readers
☆39 Updated last month
Alternatives and similar repositories for sample-files
Users who are interested in sample-files are comparing it to the repositories listed below
- A simple python wrapper for PDFium. ☆17 Updated 3 years ago
- PDF 2.0 example files ☆90 Updated 4 months ago
- Easy to use PDF CLI tool powered by PDFium and go-pdfium ☆27 Updated 2 months ago
- RUPS is an acronym for Reading and Updating PDF Syntax. RUPS is a tool built on top of iText® that allows you to look inside a PDF docume… ☆309 Updated this week
- Inspect how the PDF's structure looks. ☆24 Updated last year
- CMap Resources ☆271 Updated last year
- Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML. ☆391 Updated 9 months ago
- Library used to deskew a scanned document ☆461 Updated last week
- A curated list of resources around PDF files ☆129 Updated 9 months ago
- Python library to extract tabular data from images and scanned PDFs ☆278 Updated 9 months ago
- Office OpenXML reader and writer in Rust ☆118 Updated 10 months ago
- IPP sample implementations. ☆237 Updated 2 months ago
- A Rust wrapper around PDFium allowing you to render PDFs from Rust ☆26 Updated 3 years ago
- Web interface for recognizing text, proofreading OCR, and creating fully-digitized documents. ☆176 Updated 2 weeks ago
- Converts InDesign IDML to XML ☆16 Updated 10 months ago
- gcv2hocr converts from Google Cloud Vision OCR output to hocr to make a searchable pdf. ☆106 Updated 4 years ago
- A step-by-step C# implementation of the Docstrum algorithm ☆23 Updated 4 years ago
- Tools for extracting figures, tables, text, etc. from a PDF document. ☆32 Updated 4 years ago
- Pure-python library for adding annotations to PDFs ☆202 Updated 4 years ago
- faster page_dewarp in C++ ☆32 Updated 3 years ago
- ☆948 Updated 7 months ago
- Java GUI frontend for Tesseract OCR engine ☆64 Updated 2 months ago
- Document Layout Analysis ☆372 Updated this week
- Demos, examples and utilities using PyMuPDF ☆656 Updated 10 months ago
- Document image dewarping library using a cubic sheet model ☆153 Updated last week
- Python bindings to PDFium ☆568 Updated this week
- CLI tool to extract (meta)data from PDF and manipulate PDF files ☆145 Updated last week
- Pdf to Png conversion service in Rust ☆22 Updated 2 years ago
- JBIG2 Encoder ☆17 Updated 2 months ago
- Building scantailor and its dependencies ☆58 Updated last year