Augment line images for improving OCR datasets
☆10 · Oct 4, 2023 · Updated 2 years ago
Alternatives and similar repositories for LineAug
Users interested in LineAug are comparing it to the libraries listed below.
- OCRopus model for Gothic print (Fraktur) ☆19 · Feb 16, 2020 · Updated 6 years ago
- Some bits of javascript to transcribe scanned pages using PageXML ☆17 · Mar 18, 2024 · Updated 2 years ago
- Manuals, lexica, OCR test data for PoCoTo and the profiler ☆15 · Jul 2, 2021 · Updated 4 years ago
- Small collection of PAGE XML related scripts used at the ZPD Würzburg ☆12 · Aug 2, 2024 · Updated last year
- OCR-D post-correction with encoder-attention-decoder LSTMs ☆13 · May 1, 2025 · Updated 10 months ago
- Training data from "Hauptphase I" of project "Digitalisierung historischer deutscher Zeitungen" ☆12 · Dec 17, 2021 · Updated 4 years ago
- OCR-D python tools ☆33 · Aug 16, 2024 · Updated last year
- Wrapper for the kraken OCR engine ☆12 · Jul 12, 2025 · Updated 8 months ago
- DFKI Layout Detection for OCR-D ☆47 · May 1, 2025 · Updated 10 months ago
- An Editor for creating simple or complex OCR workflows ☆17 · Jun 13, 2024 · Updated last year
- Command line tool to convert page layout files to the latest PAGE XML format. It supports all previous versions of the PAGE format as wel… ☆24 · Jan 30, 2021 · Updated 5 years ago
- Check your modified Ground Truth files with visual support! ☆10 · Jan 31, 2024 · Updated 2 years ago
- Polytonic Greek OCR tool suite based on Ocropus 0.7 ☆13 · Jul 5, 2023 · Updated 2 years ago
- OCR-D wrapper for detectron2 based segmentation models ☆17 · May 1, 2025 · Updated 10 months ago
- NewsEye / READ OCR training dataset from Austrian Newspapers (1864–1911) ☆18 · Oct 31, 2025 · Updated 4 months ago
- Automatic text comparison with an extendable variance classifier ☆13 · Sep 11, 2023 · Updated 2 years ago
- A semi-automatic open-source tool for Layout Analysis and Region EXtraction on early printed books. ☆195 · Updated this week
- ☆10 · Nov 1, 2025 · Updated 4 months ago
- Python-based research framework for developing, organizing, and deploying Deep Learning models powered by Tensorflow. ☆12 · Jun 27, 2022 · Updated 3 years ago
- ☆17 · Sep 25, 2021 · Updated 4 years ago
- An extensible viewer for OCR-D mets.xml files ☆23 · May 30, 2024 · Updated last year
- Double-checked Gold Standard Data for Training and Testing OCR Engines ☆21 · Dec 31, 2022 · Updated 3 years ago
- Tutorial on NE processing for Digital Humanities - DH Utrecht 2019 ☆24 · Jul 18, 2019 · Updated 6 years ago
- Docker container for ocropus3 OCR system ☆12 · Aug 19, 2018 · Updated 7 years ago
- Transkriptionen von Fibeln (19. Jahrhundert) ☆11 · Oct 31, 2025 · Updated 4 months ago
- ☆10 · Jan 22, 2023 · Updated 3 years ago
- convert PubLayNet data into METS/PAGE-XML ☆10 · Mar 17, 2020 · Updated 6 years ago
- This repository contains NLU related material for the I833 Deep Learning course at University of Applied Sciences Dresden ☆13 · Dec 16, 2024 · Updated last year
- Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader) ☆201 · May 21, 2025 · Updated 9 months ago
- ☆10 · Mar 16, 2023 · Updated 3 years ago
- Python bindings for a c++ based implementation of the Nested Hierarchical Pitman-Yor Language model ☆13 · Nov 24, 2016 · Updated 9 years ago
- Python bindings for libwapiti ☆67 · Dec 9, 2019 · Updated 6 years ago
- A documentation for FAIR GPT, a virtual RDM consultant ☆15 · Oct 10, 2024 · Updated last year
- OCR & Ground Truth Resources ☆78 · May 3, 2022 · Updated 3 years ago
- JS for overlaying OCR on image using HOCR formatted HTML ☆26 · Jul 30, 2016 · Updated 9 years ago
- Extract the MODS/ALTO metadata of a bunch of METS/ALTO files into pandas DataFrames for data analysis ☆13 · Aug 21, 2025 · Updated 6 months ago
- Read-only unofficial mirror of Pynini ☆17 · May 7, 2019 · Updated 6 years ago
- An OCR evaluation tool ☆69 · Aug 22, 2025 · Updated 6 months ago
- OCR-D-compliant page segmentation ☆67 · Nov 19, 2025 · Updated 4 months ago