CITlabRostock / citlab-article-separation-new
Modules used for separating articles in (historical) newspapers and similar documents. This repository is part of the European Union's Horizon 2020 project NewsEye. For more information about the project, see https://www.newseye.eu/.
☆18 Updated 2 years ago
Related projects
Alternatives and complementary repositories for citlab-article-separation-new
- ☆10 Updated 2 years ago
- Some bits of javascript to transcribe scanned pages using PageXML ☆17 Updated 7 months ago
- You Actually Look Twice At it ☆29 Updated last month
- Training data from "Hauptphase I" of project "Digitalisierung historischer deutscher Zeitungen" ☆12 Updated 2 years ago
- Layout analysis to find layout elements in documents (similar to P2PaLA) ☆17 Updated last week
- OCRopus model for Gothic print (Fraktur) ☆18 Updated 4 years ago
- An extensible viewer for OCR-D mets.xml files ☆20 Updated 5 months ago
- Convert PAGE (v. 2019) to ALTO (v. 2.0 - 4.2) ☆13 Updated 3 weeks ago
- Named entity annotation tool ☆27 Updated last year
- ☆47 Updated this week
- Python tools for performing various operations on ALTO XML files ☆39 Updated last year
- Transcriptions of primers (19th century) ☆11 Updated 8 months ago
- Reichsanzeiger-NLP: NER/NEL corpus for the German historical newspaper "Deutscher Reichsanzeiger und Preußischer Staatsanzeiger" (1819–19… ☆14 Updated 3 weeks ago
- OCR-D wrapper for prima-pagetopdf ☆8 Updated last week
- Check your modified Ground Truth files with visual support! ☆10 Updated 9 months ago
- A Pythonic API and some command line tools to access the Transkribus server via its REST API ☆27 Updated last year
- Docker integration of Kitodo.Production and OCR-D ☆9 Updated 7 months ago
- Convert between Tesseract hOCR and ALTO XML using XSL stylesheets ☆52 Updated 3 months ago
- ☆26 Updated 3 months ago
- Named Entity Recognition ☆16 Updated this week
- ☆14 Updated 2 years ago
- Conversions between various OCR formats ☆71 Updated last year
- A repository for online OCRD training infrastructure. ☆13 Updated 4 years ago
- CERberus -- guardian against character errors ☆26 Updated 8 months ago
- Training files for Greek cursive script (in early print) ☆14 Updated 3 years ago
- A repository for illustrating the transformation of a PAGE XML file into XML-TEI format, resulting from experimentations made for the LEC… ☆15 Updated 2 years ago
- A suite of batches and tools for OCR tasks. ☆71 Updated last year
- Extract the MODS/ALTO metadata of a bunch of METS/ALTO files into pandas DataFrames for data analysis ☆11 Updated 3 months ago
- nnanno is a collection of tools that sample, annotate and apply computer vision to the Newspaper Navigator dataset ☆17 Updated 3 weeks ago
- Augment line images for improving OCR datasets ☆9 Updated last year