Modules used for separating articles in (historical) newspapers and similar documents. This repository is part of the European Union's Horizon 2020 project NewsEye. For more information about the project see https://www.newseye.eu/.
☆22Sep 2, 2022Updated 3 years ago
Alternatives and similar repositories for citlab-article-separation-new
Users that are interested in citlab-article-separation-new are comparing it to the libraries listed below.
- OCRopus model for Gothic print (Fraktur)☆19Feb 16, 2020Updated 6 years ago
- Training data from "Hauptphase I" of project "Digitalisierung historischer deutscher Zeitungen"☆12Dec 17, 2021Updated 4 years ago
- Convert PAGE (v. 2019) to ALTO (v. 2.0 - 4.2)☆15Jan 20, 2026Updated 2 months ago
- Convert PubLayNet data into METS/PAGE-XML☆10Mar 17, 2020Updated 6 years ago
- Layout analysis to find layout elements in documents (similar to P2PaLA)☆20Mar 24, 2026Updated 3 weeks ago
- Convert Transkribus PAGE-XML to standard PAGE-XML☆12Dec 10, 2025Updated 4 months ago
- OCR-D post-correction module based on weighted finite-state transducers☆11Jan 13, 2024Updated 2 years ago
- Reichsanzeiger-NLP: NER/NEL corpus for the German historical newspaper "Deutscher Reichsanzeiger und Preußischer Staatsanzeiger" (1819–19…☆16Oct 18, 2024Updated last year
- A Pythonic API and some command line tools to access the Transkribus server via its REST API☆28Nov 25, 2022Updated 3 years ago
- An extensible viewer for OCR-D mets.xml files☆23May 30, 2024Updated last year
- Specification of the @OCR-D technical architecture, interface definitions and data exchange format(s)☆17Sep 18, 2025Updated 6 months ago
- texrex web page cleaning & ClaraX random walk crawler☆11Dec 13, 2021Updated 4 years ago
- Master repository which includes most other OCR-D repositories as submodules☆73Jul 4, 2025Updated 9 months ago
- Recognize text using Calamari OCR and the OCR-D framework☆16May 13, 2025Updated 11 months ago
- Transcriptions of primers (19th century)☆11Oct 31, 2025Updated 5 months ago
- Double-checked Gold Standard Data for Training and Testing OCR Engines☆21Dec 31, 2022Updated 3 years ago
- OCR-D wrapper for detectron2 based segmentation models☆17May 1, 2025Updated 11 months ago
- Web application for transcribing OCR ground truth from Archive.org☆17Feb 22, 2018Updated 8 years ago
- Some bits of JavaScript to transcribe scanned pages using PageXML☆17Mar 18, 2024Updated 2 years ago
- A repository for online OCR-D training infrastructure.☆13Aug 20, 2020Updated 5 years ago
- GC4LM: A Colossal (Biased) language model for German☆13May 2, 2021Updated 4 years ago
- DFKI Layout Detection for OCR-D☆47May 1, 2025Updated 11 months ago
- ☆68Mar 23, 2026Updated 3 weeks ago
- OCR-D Python tools☆33Aug 16, 2024Updated last year
- OCR-D post-correction with encoder-attention-decoder LSTMs☆13May 1, 2025Updated 11 months ago
- Django web application to display, annotate, and export digitized books.☆33Apr 7, 2026Updated last week
- ALTO XML schema - latest and all former versions☆55Jan 20, 2026Updated 2 months ago
- ☆10Jan 22, 2023Updated 3 years ago
- Converters for various file formats used for representing OCR☆12Apr 30, 2025Updated 11 months ago
- OCR-D compliant toolset for optical layout recognition on historical German-language documents published in Brazil☆11Sep 24, 2021Updated 4 years ago
- Neural models for detecting and masking personal information from texts☆16Nov 25, 2022Updated 3 years ago
- A CLI tool that generates IIIF Presentation 2.1 Manifests from METS/MODS☆24Apr 17, 2025Updated 11 months ago
- Tools for TICCL☆14Dec 12, 2025Updated 4 months ago
- Tool that does layout analysis and/or text recognition using tesseract and outputs the result in Page XML format☆46Mar 31, 2025Updated last year
- Identifying Historical People, Places and other Entities: Shared Task on Named Entity Recognition and Linking on Historical Newspapers at…☆21Aug 1, 2024Updated last year
- Tutorials on interfaces and example data analyses☆29Feb 21, 2026Updated last month
- Check your modified Ground Truth files with visual support!☆10Jan 31, 2024Updated 2 years ago
- A Python module for evaluating NERC and NEL system performances as defined in the HIPE shared tasks (formerly CLEF-HIPE-2020-scorer).☆15Jun 4, 2024Updated last year
- Norwegian Speech Transformer Models☆19Mar 26, 2026Updated 2 weeks ago