jnweiger / pdfcompare
compare two PDF files, write a resulting PDF with highlighted changes
☆57 Updated last year
Alternatives and similar repositories for pdfcompare
Users who are interested in pdfcompare are comparing it to the libraries listed below.
- A PDF comparison utility in Python. ☆502 Updated last year
- PDF Extraction Toolkit ☆42 Updated 5 years ago
- ☆39 Updated 10 years ago
- Extract tables from PDF pages. ☆298 Updated 5 years ago
- OCR for DjVu ☆47 Updated 3 years ago
- Python binding to libpoppler with focus on text extraction ☆97 Updated 3 years ago
- PDF to XML ALTO file converter ☆257 Updated last month
- ☆75 Updated 4 years ago
- Get semantic HTML from PDFs, recover lost text, tables, data... in bulk. ☆36 Updated last year
- Tool that does layout analysis and/or text recognition using tesseract and outputs the result in Page XML format ☆46 Updated 8 months ago
- Automatic de-keystoning for single camera DIY book scanners ☆25 Updated 9 years ago
- A simple viewer and inspection tool for text boxes in PDF documents ☆96 Updated 3 years ago
- A more complete example of programming with PDFMiner, which continues where the default documentation stops ☆216 Updated 6 years ago
- smoothscan is a tool to convert scanned text into a vectorized output form. ☆67 Updated 12 years ago
- The hOCR Embedded OCR Workflow and Output Format ☆75 Updated last year
- a utility to extract the title from a PDF file ☆143 Updated 10 months ago
- my take at a PDF text extraction utility ☆25 Updated 10 years ago
- PDF Command Line Tools Source ☆266 Updated this week
- A library for extracting tables from PDF files ☆92 Updated 5 years ago
- Command line tool to extract figures, tables, and captions from scholarly documents in PDF form. ☆130 Updated 7 years ago
- A wrapper for tesseract / abbyyOCR11 ocr4linux finereader cli that can perform batch operations or monitor a directory and launch an OCR … ☆67 Updated last year
- web interface for recoll desktop search ☆291 Updated 5 years ago
- The open source tools for building, maintaining and deploying Topic Maps-based applications. ☆57 Updated last week
- Textricator is a tool to extract text from documents and generate structured data. ☆350 Updated 9 months ago
- Extract meaningful content from pdf and psd file, such as texts and images both linked into a common JSON string ☆36 Updated 7 years ago
- Ergonomic line-by-line transcription of scanned text. ☆54 Updated 5 years ago
- PDBF - A Toolkit for Creating Janiform Data Documents ☆50 Updated 9 years ago
- Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML. ☆404 Updated last year
- MathWebSearch Implementation ☆48 Updated 3 years ago
- Wandora is a general purpose information extraction, management and publishing application based on Topic Maps and Java. ☆133 Updated 2 years ago