INL / OpenConvert
Text conversion tool (from e.g. Word, HTML, txt) to the corpus formats TEI or FoLiA
☆23 · Updated 2 years ago
Related projects
Alternatives and complementary repositories for OpenConvert
- nnanno is a collection of tools that sample, annotate and apply computer vision to the Newspaper Navigator dataset ☆17 · Updated last month
- Convert between Tesseract hOCR and ALTO XML using XSL stylesheets ☆53 · Updated 4 months ago
- Named entity annotation tool ☆27 · Updated last year
- Named Entity Recognition tool for Europeana Newspapers ☆14 · Updated 6 years ago
- Data Mining Historical Newspaper Metadata (METS/ALTO formats) ☆24 · Updated 2 years ago
- Graph-based tool for disambiguation and linking of named entities to Linked Data sets for Digital Humanities and heritage texts ☆27 · Updated 3 years ago
- OCRopus model for Gothic print (Fraktur) ☆18 · Updated 4 years ago
- Tutorial on NE processing for Digital Humanities - DH Utrecht 2019 ☆25 · Updated 5 years ago
- Efficient indexing and retrieval of OCR bounding boxes in Solr ☆22 · Updated 5 years ago
- CollateX – Software for Collating Textual Sources ☆90 · Updated 8 months ago
- Edition Visualization Technology 2 - development ☆75 · Updated 9 months ago
- Brucheion is a Virtual Research Environment (VRE) to create Linked Open Data (LOD) for historical languages and the research of historica… ☆14 · Updated last year
- Java based viewer for PAGE XML files (layout + text content). Also supports ALTO XML, FineReader XML, and HOCR. ☆35 · Updated last year
- Web application for transcribing OCR ground truth from Archive.org ☆17 · Updated 6 years ago
- Kiln is a multi-platform framework for building and deploying complex websites whose source content is primarily in XML. It brings togeth… ☆34 · Updated 2 years ago
- High-performance text aligner for large collections of texts ☆45 · Updated last month
- Exercises for the XQuery Workshops at XQuery at DH2017 ☆47 · Updated 6 years ago
- A deep learning architecture for reference mining from literature in the arts and humanities. ☆15 · Updated 5 years ago
- EFES (EpiDoc Front End Services) is a custom and readily customizable platform for publication and search/indexing of EpiDoc files, based… ☆31 · Updated 5 months ago
- Automatically exported from code.google.com/p/oxygen-tei ☆15 · Updated last week
- Repository for the book Among Digitized Manuscripts by L.W. Cornelis van Lit (Leiden: Brill, 2020) ☆20 · Updated 4 years ago
- Node.JS/Browser Web Annotation Framework ☆12 · Updated last year
- Multi Tier Annotation Search ☆12 · Updated 6 months ago
- Heidelberg Monograph Publishing Tool (heiMPT) is a stand-alone platform, as well as a plug-in application for OMP. It enables a high degre… ☆22 · Updated 2 years ago
- A CLI tool that generates IIIF Presentation 2.1 Manifests from METS/MODS ☆23 · Updated 2 months ago
- Extract the MODS/ALTO metadata of a bunch of METS/ALTO files into pandas DataFrames for data analysis ☆11 · Updated 3 months ago
- An implementation of the TEI Simple ODD extensions for processing models in XQuery. ☆22 · Updated 5 years ago
- Python tools for performing various operations on ALTO XML files ☆39 · Updated last year