Data Mining Historical Newspaper Metadata (METS/ALTO formats)
☆25 · Feb 6, 2026 · Updated 2 months ago
Alternatives and similar repositories for EN-data_mining
Users who are interested in EN-data_mining are comparing it to the libraries listed below.
- Conversions between various OCR formats ☆84 · Feb 13, 2026 · Updated last month
- Awesome AI in Libraries ☆17 · Jul 21, 2023 · Updated 2 years ago
- convert NDNP data to IIIF ☆12 · Jun 7, 2016 · Updated 9 years ago
- ☆11 · Jul 18, 2016 · Updated 9 years ago
- Extract the MODS/ALTO metadata of a bunch of METS/ALTO files into pandas DataFrames for data analysis ☆13 · Aug 21, 2025 · Updated 7 months ago
- OCR-D post-correction module based on weighted finite-state transducers ☆11 · Jan 13, 2024 · Updated 2 years ago
- Vue-based Web Component for creating narrative presentations of images and maps ☆15 · May 1, 2025 · Updated 11 months ago
- TIFY is a slim and mobile-friendly IIIF document viewer. ☆124 · Mar 25, 2026 · Updated 2 weeks ago
- Text Corpus of African American Fiction and Poetry, from 1853-1923 ☆11 · Aug 5, 2020 · Updated 5 years ago
- Tentative way towards a shared API for prosopographical data based on the factoid model (Bradley/Short 2005) ☆24 · Aug 25, 2022 · Updated 3 years ago
- An extensible viewer for OCR-D mets.xml files ☆23 · May 30, 2024 · Updated last year
- OCRopus model for Gothic print (Fraktur) ☆19 · Feb 16, 2020 · Updated 6 years ago
- METS/ALTO OCR enhancing tool by the National Library of Luxembourg (BnL) ☆56 · May 30, 2023 · Updated 2 years ago
- Named Entity Recognition tool for Europeana Newspapers ☆14 · Apr 5, 2018 · Updated 8 years ago
- A collection of notebooks for Natural Language Processing ☆25 · Jan 13, 2025 · Updated last year
- Web-based page layout editor created for EMOP (Early Modern OCR Project). ☆11 · May 21, 2021 · Updated 4 years ago
- Java based viewer for PAGE XML files (layout + text content). Also supports ALTO XML, FineReader XML, and HOCR. ☆35 · May 25, 2023 · Updated 2 years ago
- 'ocr-evaluation-tools' from http://ancientgreekocr.org/. Tools to test OCR accuracy. ☆23 · Feb 21, 2018 · Updated 8 years ago
- OCR-D python tools ☆33 · Aug 16, 2024 · Updated last year
- A simple IIIF and Mirador compatible Annotation Server ☆102 · Mar 1, 2026 · Updated last month
- IIIF Examples and useful code ☆20 · Sep 10, 2025 · Updated 7 months ago
- Graph-based tool for disambiguation and linking of named entities to Linked Data sets for Digital Humanities and heritage texts ☆28 · Sep 20, 2021 · Updated 4 years ago
- (ICFHR 2020 oral) Code for "docExtractor: An off-the-shelf historical document element extraction" paper ☆88 · May 25, 2023 · Updated 2 years ago
- CERberus -- guardian against character errors ☆29 · Feb 15, 2024 · Updated 2 years ago
- version 4.x of the Princeton Geniza Project ☆12 · Apr 2, 2026 · Updated last week
- Web service for creating and hosting IIIF manifests from METS/MODS documents ☆36 · Dec 8, 2022 · Updated 3 years ago
- Computational Historical Thinking: With Applications in R ☆61 · Mar 23, 2020 · Updated 6 years ago
- R/Shiny application for viewing microbial community data ☆19 · Dec 17, 2020 · Updated 5 years ago
- Terminal tool that converts files encoding to UTF-8 ☆10 · Oct 5, 2019 · Updated 6 years ago
- ☆15 · May 19, 2020 · Updated 5 years ago
- Flask-IIIF is permitting easy integration with the International Image Interoperability Framework (IIIF) API standards. ☆27 · Jan 27, 2026 · Updated 2 months ago
- Core libraries by the PRImA Research Lab ☆16 · Jul 30, 2024 · Updated last year
- Package to standardize country names and set Correlate of War IDs ☆12 · Mar 7, 2023 · Updated 3 years ago
- a geographical visualization ☆12 · Mar 28, 2018 · Updated 8 years ago
- This project collects the map assets (Shapefiles and GeoJSON) that were used for the "Manifest Destiny" Visualization (http://michaelpora… ☆24 · Oct 25, 2012 · Updated 13 years ago
- Get insights from OrientDB database using PyOrient through IBM Watson Studio ☆13 · Apr 22, 2019 · Updated 6 years ago
- A CLI tool that generates IIIF Presentation 2.1 Manifests from METS/MODS ☆24 · Apr 17, 2025 · Updated 11 months ago
- Finnish Meteorological Institute open data API R client ☆10 · Sep 9, 2019 · Updated 6 years ago
- ARCHIVED Extract Text from 'PDFs' ☆21 · May 10, 2022 · Updated 3 years ago