PRImA-Research-Lab / prima-page-converter
Command line tool to convert page layout files to the latest PAGE XML format. It supports all previous versions of the PAGE format as well as ALTO XML, FineReader XML, and HOCR.
☆24 · Jan 30, 2021 · Updated 5 years ago
Alternatives and similar repositories for prima-page-converter
Users who are interested in prima-page-converter are comparing it to the libraries listed below.
- Converters for various file formats used for representing OCR ☆12 · Apr 30, 2025 · Updated 9 months ago
- Java command line tool to convert PAGE XML files with layout and text content to PDF ☆10 · Apr 27, 2020 · Updated 5 years ago
- Check your modified Ground Truth files with visual support! ☆10 · Jan 31, 2024 · Updated 2 years ago
- Java based viewer for PAGE XML files (layout + text content). Also supports ALTO XML, FineReader XML, and HOCR. ☆35 · May 25, 2023 · Updated 2 years ago
- Core libraries by the PRImA Research Lab ☆16 · Jul 30, 2024 · Updated last year
- Convert between Tesseract hOCR and ALTO XML using XSL stylesheets ☆59 · Sep 25, 2025 · Updated 4 months ago
- Manuals, lexica, OCR test data for PoCoTo and the profiler ☆15 · Jul 2, 2021 · Updated 4 years ago
- Simple app for visual editing of Page XML files ☆31 · Sep 25, 2025 · Updated 4 months ago
- A repository for online OCRD training infrastructure. ☆13 · Aug 20, 2020 · Updated 5 years ago
- Convert PAGE (v. 2019) to ALTO (v. 2.0 - 4.2) ☆14 · Jan 20, 2026 · Updated 3 weeks ago
- Augment line images for improving OCR datasets ☆10 · Oct 4, 2023 · Updated 2 years ago
- Earley based parsing tools for XSLT ☆10 · Oct 8, 2020 · Updated 5 years ago
- OCRopus model for Gothic print (Fraktur) ☆19 · Feb 16, 2020 · Updated 5 years ago
- Docker container for ocropus3 OCR system ☆12 · Aug 19, 2018 · Updated 7 years ago
- ☆10 · Aug 5, 2019 · Updated 6 years ago
- ☆10 · Mar 16, 2023 · Updated 2 years ago
- ☆14 · Sep 12, 2019 · Updated 6 years ago
- An OCR evaluation tool ☆69 · Aug 22, 2025 · Updated 5 months ago
- ☆32 · Aug 29, 2025 · Updated 5 months ago
- ☆17 · Sep 25, 2021 · Updated 4 years ago
- Named Entity Recognition tool for Europeana Newspapers ☆14 · Apr 5, 2018 · Updated 7 years ago
- CoffeePot releases and website pages, see nineml/nineml ☆13 · Dec 26, 2025 · Updated last month
- ☆66 · Feb 3, 2026 · Updated last week
- Training data from "Hauptphase I" of project "Digitalisierung historischer deutscher Zeitungen" ☆12 · Dec 17, 2021 · Updated 4 years ago
- Useful XProc scripts ☆12 · Oct 31, 2017 · Updated 8 years ago
- Documentation and use cases for ALTO XML ☆42 · Sep 10, 2018 · Updated 7 years ago
- TensorFlow implementation of a segmentation system for document images. ☆35 · Sep 9, 2018 · Updated 7 years ago
- OCR-D python tools ☆33 · Aug 16, 2024 · Updated last year
- XPath/XQuery extension function library for the Saxon XSLT processor ☆13 · Dec 14, 2022 · Updated 3 years ago
- An invisible-XML processor for XQuery and XSLT ☆14 · Jun 11, 2024 · Updated last year
- XSLT Functions for Transpect ☆13 · Feb 6, 2026 · Updated last week
- ☆23 · Nov 10, 2017 · Updated 8 years ago
- Update of the ISRI Analytic Tools for OCR Evaluation with UTF-8 support ☆59 · Apr 16, 2021 · Updated 4 years ago
- 'lat' repository, forked from https://github.com/ryanfb/ancientgreekocr-grc. The final training process for lat.traineddata ☆13 · Jan 13, 2016 · Updated 10 years ago
- Repository for the deep-learning framework DIVA-DAF, which is built with historical document image analysis in mind. ☆18 · Nov 7, 2024 · Updated last year
- Some bits of javascript to transcribe scanned pages using PageXML ☆17 · Mar 18, 2024 · Updated last year
- OCR-D post-correction module based on weighted finite-state transducers ☆11 · Jan 13, 2024 · Updated 2 years ago
- Catalog of functional programming idioms (in XQuery 3.0) ☆16 · Aug 4, 2016 · Updated 9 years ago
- Automatically exported from code.google.com/p/oxygen-tei ☆17 · Nov 6, 2025 · Updated 3 months ago