CITlabRostock / citlab-article-separation-new
Modules used for separating articles in (historical) newspapers and similar documents. This repository is part of the European Union's Horizon 2020 project NewsEye. For more information about the project see https://www.newseye.eu/.
☆20, updated 2 years ago
Alternatives and similar repositories for citlab-article-separation-new:
Users who are interested in citlab-article-separation-new compare it to the repositories listed below.
- Small collection of PAGE XML related scripts used at the ZPD Würzburg (☆13, updated 9 months ago)
- Convert Transkribus PAGE-XML to standard PAGE-XML (☆12, updated 10 months ago)
- Named entity annotation tool (☆28, updated last year)
- Reichsanzeiger-NLP: NER/NEL corpus for the German historical newspaper "Deutscher Reichsanzeiger und Preußischer Staatsanzeiger" (1819–19… (☆16, updated 6 months ago)
- Training data from "Hauptphase I" of project "Digitalisierung historischer deutscher Zeitungen" (☆12, updated 3 years ago)
- OCR-D wrapper for prima-pagetopdf (☆9, updated this week)
- Transkriptionen von Fibeln (19. Jahrhundert) (☆11, updated last year)
- You Actually Look Twice At it (☆34, updated 3 months ago)
- Convert PAGE (v. 2019) to ALTO (v. 2.0 - 4.2) (☆14, updated 3 weeks ago)
- Some bits of JavaScript to transcribe scanned pages using PageXML (☆17, updated last year)
- An extensible viewer for OCR-D mets.xml files (☆20, updated 11 months ago)
- OCRopus model for Gothic print (Fraktur) (☆18, updated 5 years ago)
- Tools for normalizing the use of some characters and checking file consistencies (☆11, updated 4 months ago)
- Named Entity Recognition (☆19, updated 3 weeks ago)
- Python tools for performing various operations on ALTO XML files (☆46, updated 2 months ago)
- Extract the MODS/ALTO metadata of a bunch of METS/ALTO files into pandas DataFrames for data analysis (☆11, updated 5 months ago)
- Layout analysis to find layout elements in documents (similar to P2PaLA) (☆19, updated this week)
- Check your modified Ground Truth files with visual support! (☆10, updated last year)
- Docker integration of Kitodo.Production and OCR-D (☆9, updated last year)
- Project DAHN "Digital Edition of historical manuscripts (correspondences)" (☆15, updated 6 months ago)
- A repository for illustrating the transformation of a PAGE XML file into XML-TEI format, resulting from experimentations made for the LEC… (☆17, updated 2 years ago)
- Conversions between various OCR formats (☆77, updated last year)
- Digitale Geisteswissenschaften rund um Graphentechnologien (☆8, updated 2 months ago)
- NLP-helper for OCR-ed pages in PAGE XML format (☆10, updated 5 months ago)
- DTA Base Format (DTABf) (☆18, updated last month)