CITlabRostock / citlab-article-separation-new
Modules used for separating articles in (historical) newspapers and similar documents. This repository is part of the European Union's Horizon 2020 project NewsEye. For more information about the project see https://www.newseye.eu/.
☆22 Updated 3 years ago
Alternatives and similar repositories for citlab-article-separation-new
Users that are interested in citlab-article-separation-new are comparing it to the libraries listed below
- Tools for normalizing the use of some characters and checking file consistencies ☆11 Updated 9 months ago
- Convert Transkribus PAGE-XML to standard PAGE-XML ☆12 Updated last year
- Convert PAGE (v. 2019) to ALTO (v. 2.0 - 4.2) ☆14 Updated 5 months ago
- Transcriptions of primers (19th century) ☆11 Updated last year
- Some bits of javascript to transcribe scanned pages using PageXML ☆17 Updated last year
- An extensible viewer for OCR-D mets.xml files ☆22 Updated last year
- ☆14 Updated 3 years ago
- Training data from "Hauptphase I" of project "Digitalisierung historischer deutscher Zeitungen" ☆12 Updated 3 years ago
- Small collection of PAGE XML related scripts used at the ZPD Würzburg ☆12 Updated last year
- Named entity annotation tool ☆28 Updated 2 years ago
- You Actually Look Twice At it ☆36 Updated 9 months ago
- Check your modified Ground Truth files with visual support! ☆10 Updated last year
- ☆61 Updated 2 weeks ago
- Reichsanzeiger-NLP: NER/NEL corpus for the German historical newspaper "Deutscher Reichsanzeiger und Preußischer Staatsanzeiger" (1819–19… ☆16 Updated last year
- A repository for online OCRD training infrastructure. ☆13 Updated 5 years ago
- Extract the MODS/ALTO metadata of a bunch of METS/ALTO files into pandas DataFrames for data analysis ☆12 Updated 2 months ago
- A Pythonic API and some command line tools to access the Transkribus server via its REST API ☆27 Updated 2 years ago
- ☆32 Updated 2 months ago
- OCRopus model for Gothic print (Fraktur) ☆18 Updated 5 years ago
- Training files for Greek cursive script (in early print) ☆15 Updated 4 years ago
- Python tools for performing various operations on ALTO XML files ☆48 Updated 8 months ago
- Layout analysis to find layout elements in documents (similar to P2PaLA) ☆20 Updated last week
- Fork of dhSegment for experiments on visual and textual feature combination. ☆15 Updated 4 years ago
- Command line tool to convert page layout files to the latest PAGE XML format. It supports all previous versions of the PAGE format as wel… ☆23 Updated 4 years ago
- An OCR evaluation tool ☆68 Updated 2 months ago
- Conversions between various OCR formats ☆81 Updated 2 years ago
- Master repository which includes most other OCR-D repositories as submodules ☆72 Updated 3 months ago
- Java command line tool to convert PAGE XML files with layout and text content to PDF ☆10 Updated 5 years ago
- Java based viewer for PAGE XML files (layout + text content). Also supports ALTO XML, FineReader XML, and HOCR. ☆35 Updated 2 years ago
- Named Entity Recognition ☆18 Updated 6 months ago