jnweiger / pdfcompare
compare two PDF files, write a resulting PDF with highlighted changes
☆54 Updated 3 months ago
Related projects
Alternatives and complementary repositories for pdfcompare
- PDF Extraction Toolkit ☆41 Updated 4 years ago
- Python binding to libpoppler with focus on text extraction ☆98 Updated 2 years ago
- List of tools for dealing with the wonderful PDF format. ☆45 Updated 4 years ago
- OCR for DjVu ☆45 Updated 2 years ago
- Command line tool to extract figures, tables, and captions from scholarly documents in PDF form. ☆129 Updated 6 years ago
- A PDF comparison utility in Python. ☆454 Updated 5 months ago
- Language checker and hyphenator extension for LibreOffice ☆13 Updated 4 years ago
- pdf2xml converter based on Xpdf library - modified version ☆27 Updated 6 years ago
- An efficient data structure for fast string similarity searches ☆23 Updated 3 years ago
- ☆71 Updated last year
- A library for extracting tables from PDF files ☆90 Updated 11 years ago
- ☆36 Updated 9 years ago
- Extract meaningful content from PDF and PSD files, such as text and images, both linked into a common JSON string ☆36 Updated 6 years ago
- A more complete example of programming with PDFMiner, which continues where the default documentation stops ☆215 Updated 4 years ago
- The CIS OCR PostCorrectionTool ☆40 Updated 2 years ago
- Get semantic HTML from PDFs, recover lost text, tables, data... in bulk. ☆28 Updated this week
- 'ocr-evaluation-tools' from http://ancientgreekocr.org/. Tools to test OCR accuracy. ☆22 Updated 6 years ago
- A web based data mining workflow platform with real-time analysis capabilities ☆49 Updated 2 years ago
- An index data structure for approximate string search. ☆23 Updated 5 years ago
- copy of pdftohtml code with enhancements ☆25 Updated last year
- PDF Table Extractor - repository to hold revisable version of code from https://www.cvast.tuwien.ac.at/projects/pdf2table by Burcu Yildiz ☆38 Updated 8 months ago
- A selection of test lines of several early printed books as well as the corresponding individual OCRopus models and mixed models. ☆10 Updated 6 years ago
- A tool for semantic relation extraction. The program finds pairs of semantically related words based on the text definitions coming from … ☆28 Updated 10 years ago
- liberate all kinds of data from PDF and other unstructured formats and make the information machine-readable and visualizable for popul… ☆27 Updated 6 years ago
- Wrapper around pixel classifier ☆9 Updated 2 years ago
- Scripts and results from our OCR roundup, available on Source ☆150 Updated 5 years ago
- The hOCR Embedded OCR Workflow and Output Format ☆74 Updated 3 months ago
- A (new) cairo backend for Matplotlib. ☆106 Updated 2 weeks ago
- Converts XML to LaTeX ☆43 Updated this week
- Inkscape extension to assist creating dimension annotations. ☆65 Updated 2 years ago