CITlabRostock / citlab-article-separation-new
Modules used for separating articles in (historical) newspapers and similar documents. This repository is part of the European Union's Horizon 2020 project NewsEye. For more information about the project see https://www.newseye.eu/.
☆20Updated 2 years ago
Alternatives and similar repositories for citlab-article-separation-new
Users that are interested in citlab-article-separation-new are comparing it to the libraries listed below
- Convert Transkribus PAGE-XML to standard PAGE-XML☆12Updated 11 months ago
- Named entity annotation tool☆28Updated last year
- You Actually Look Twice At it☆35Updated 4 months ago
- Small collection of PAGE XML related scripts used at the ZPD Würzburg☆13Updated 10 months ago
- Training data from "Hauptphase I" of project "Digitalisierung historischer deutscher Zeitungen"☆12Updated 3 years ago
- Tools for normalizing the use of some characters and checking file consistencies☆11Updated 4 months ago
- Convert PAGE (v. 2019) to ALTO (v. 2.0 - 4.2)☆14Updated 3 weeks ago
- Reichsanzeiger-NLP: NER/NEL corpus for the German historical newspaper "Deutscher Reichsanzeiger und Preußischer Staatsanzeiger" (1819–19…☆16Updated 7 months ago
- OCRopus model for Gothic print (Fraktur)☆18Updated 5 years ago
- Some bits of javascript to transcribe scanned pages using PageXML☆17Updated last year
- A Pythonic API and some command line tools to access the Transkribus server via its REST API☆27Updated 2 years ago
- Transkriptionen von Fibeln (19. Jahrhundert)☆11Updated last year
- An extensible viewer for OCR-D mets.xml files☆21Updated last year
- Docker integration of Kitodo.Production and OCR-D☆9Updated last year
- Named Entity Recognition☆19Updated last month
- Layout analysis to find layout elements in documents (similar to P2PaLA)☆19Updated this week
- Tutorial on NE processing for Digital Humanities - DH Utrecht 2019☆25Updated 5 years ago
- A fully-fledged PyTorch package for Morphological Analysis, tailored to morphologically rich and historical languages.☆23Updated last year
- Java based viewer for PAGE XML files (layout + text content). Also supports ALTO XML, FineReader XML, and HOCR.☆35Updated 2 years ago
- METS/ALTO OCR enhancing tool by the National Library of Luxembourg (BnL)☆53Updated 2 years ago
- Convert between Tesseract hOCR and ALTO XML using XSL stylesheets☆55Updated 2 weeks ago
- Python tools for performing various operations on ALTO XML files☆47Updated 3 months ago
- Check your modified Ground Truth files with visual support!☆10Updated last year
- Conversions between various OCR formats☆77Updated 2 years ago
- Converters for various file formats used for representing OCR☆12Updated last month
- The CIS OCR PostCorrectionTool☆42Updated 2 years ago
- OCR-D wrapper for prima-pagetopdf☆9Updated 2 weeks ago
- Extract the MODS/ALTO metadata of a bunch of METS/ALTO files into pandas DataFrames for data analysis☆11Updated this week