CITlabRostock / citlab-article-separation-new
Modules used for separating articles in (historical) newspapers and similar documents. This repository is part of the European Union's Horizon 2020 project NewsEye. For more information about the project see https://www.newseye.eu/.
☆20 Updated 2 years ago
Alternatives and similar repositories for citlab-article-separation-new:
Users who are interested in citlab-article-separation-new are comparing it to the repositories listed below:
- Convert Transkribus PAGE-XML to standard PAGE-XML ☆12 Updated 8 months ago
- You Actually Look Twice At it ☆31 Updated 2 months ago
- Layout analysis to find layout elements in documents (similar to P2PaLA) ☆18 Updated this week
- Small collection of PAGE XML related scripts used at the ZPD Würzburg ☆13 Updated 7 months ago
- Training data from "Hauptphase I" of project "Digitalisierung historischer deutscher Zeitungen" ☆12 Updated 3 years ago
- Tools for normalizing the use of some characters and checking file consistencies ☆11 Updated 2 months ago
- ☆12 Updated 2 years ago
- An extensible viewer for OCR-D mets.xml files ☆20 Updated 9 months ago
- Some bits of JavaScript to transcribe scanned pages using PageXML ☆17 Updated last year
- Named entity annotation tool ☆27 Updated last year
- A Pythonic API and some command line tools to access the Transkribus server via its REST API ☆27 Updated 2 years ago
- OCR-D wrapper for prima-pagetopdf ☆9 Updated last week
- Convert PAGE (v. 2019) to ALTO (v. 2.0 - 4.2) ☆14 Updated 2 weeks ago
- Extract the MODS/ALTO metadata of a bunch of METS/ALTO files into pandas DataFrames for data analysis ☆11 Updated 3 months ago
- ☆58 Updated last month
- Transkriptionen von Fibeln (19. Jahrhundert) ☆11 Updated last year
- OCRopus model for Gothic print (Fraktur) ☆18 Updated 5 years ago
- Reichsanzeiger-NLP: NER/NEL corpus for the German historical newspaper "Deutscher Reichsanzeiger und Preußischer Staatsanzeiger" (1819–19… ☆15 Updated 5 months ago
- Project DAHN "Digital Edition of historical manuscripts (correspondences)" ☆15 Updated 4 months ago
- Python tools for performing various operations on ALTO XML files ☆45 Updated 3 weeks ago
- Pipeline for the production of digital scholarly editions of archival collections ☆12 Updated last year
- ☆31 Updated 2 months ago
- Docker integration of Kitodo.Production and OCR-D ☆9 Updated last year
- CERberus -- guardian against character errors ☆28 Updated last year
- Check your modified Ground Truth files with visual support! ☆10 Updated last year
- Java-based viewer for PAGE XML files (layout + text content). Also supports ALTO XML, FineReader XML, and HOCR. ☆35 Updated last year
- A repository for illustrating the transformation of a PAGE XML file into XML-TEI format, resulting from experimentations made for the LEC… ☆17 Updated 2 years ago
- Conversions between various OCR formats ☆74 Updated last year
- Java command line tool to convert PAGE XML files with layout and text content to PDF ☆10 Updated 4 years ago
- Augment line images for improving OCR datasets ☆9 Updated last year