CITlabRostock / citlab-article-separation-new
Modules used for separating articles in (historical) newspapers and similar documents. This repository is part of the European Union's Horizon 2020 project NewsEye. For more information about the project see https://www.newseye.eu/.
☆21 Updated 3 years ago
Alternatives and similar repositories for citlab-article-separation-new
Users who are interested in citlab-article-separation-new are comparing it to the repositories listed below
- Convert Transkribus PAGE-XML to standard PAGE-XML ☆12 Updated last year
- An extensible viewer for OCR-D mets.xml files ☆21 Updated last year
- ☆13 Updated 3 years ago
- Transcriptions of reading primers (19th century) ☆11 Updated last year
- Training data from "Hauptphase I" of project "Digitalisierung historischer deutscher Zeitungen" ☆12 Updated 3 years ago
- Convert PAGE (v. 2019) to ALTO (v. 2.0 - 4.2) ☆14 Updated 4 months ago
- Tools for normalizing the use of some characters and checking file consistencies ☆11 Updated 8 months ago
- Named entity annotation tool ☆28 Updated 2 years ago
- Some bits of javascript to transcribe scanned pages using PageXML ☆17 Updated last year
- Small collection of PAGE XML related scripts used at the ZPD Würzburg ☆13 Updated last year
- Reichsanzeiger-NLP: NER/NEL corpus for the German historical newspaper "Deutscher Reichsanzeiger und Preußischer Staatsanzeiger" (1819–19… ☆16 Updated 11 months ago
- OCRopus model for Gothic print (Fraktur) ☆18 Updated 5 years ago
- You Actually Look Twice At it ☆35 Updated 7 months ago
- Layout analysis to find layout elements in documents (similar to P2PaLA) ☆18 Updated last week
- Check your modified Ground Truth files with visual support! ☆10 Updated last year
- ☆61 Updated last week
- Extract the MODS/ALTO metadata of a bunch of METS/ALTO files into pandas DataFrames for data analysis ☆12 Updated 3 weeks ago
- Named Entity Recognition ☆18 Updated 5 months ago
- Python tools for performing various operations on ALTO XML files ☆48 Updated 6 months ago
- A Pythonic API and some command line tools to access the Transkribus server via its REST API ☆27 Updated 2 years ago
- ☆31 Updated 3 weeks ago
- Training files for Greek cursive script (in early print) ☆15 Updated 4 years ago
- A repository for illustrating the transformation of a PAGE XML file into XML-TEI format, resulting from experimentations made for the LEC… ☆16 Updated 3 years ago
- Conversions between various OCR formats ☆80 Updated 2 years ago
- Highlighting various OCR formats directly in Solr ☆86 Updated last week
- Library in C++ and a python wrapper for dealing with Page XML files ☆13 Updated 4 months ago
- A repository for online OCRD training infrastructure. ☆13 Updated 5 years ago
- Named Entity Recognition tool for Europeana Newspapers ☆14 Updated 7 years ago
- An OCR evaluation tool ☆66 Updated 3 weeks ago
- A suite of batches and tools for OCR tasks. ☆71 Updated 2 years ago