lizfischer / document-segmentation
Browser-based app for segmenting and OCRing PDF pages based on whitespace rules, built to help researchers (especially in the humanities) turn their materials into machine-actionable datasets.
☆11 · Updated last year
Alternatives and similar repositories for document-segmentation:
Users who are interested in document-segmentation are comparing it to the repositories listed below.
- A framework for Oxygen XML Editor allowing researchers to transcribe historical documents in TEI ☆21 · Updated 9 months ago
- Srophé Application. A TEI publishing application. ☆17 · Updated 4 months ago
- Repository for the book Among Digitized Manuscripts by L.W. Cornelis van Lit (Leiden: Brill, 2020) ☆22 · Updated 5 years ago
- Web application to build XML stand-off markup ☆15 · Updated 4 years ago
- Automated listing of repos in GitHub with XML files containing teiHeader. Find a project using TEI today! ☆16 · Updated this week
- A codebase to support a pure JSON search engine requiring no backend for any XHTML5 document collection ☆52 · Updated 3 weeks ago
- Data Mining Historical Newspaper Metadata (METS/ALTO formats) ☆25 · Updated 2 years ago
- A standalone React/Redux web application for presenting unique printed books and manuscripts in digital facsimile. ☆31 · Updated 2 years ago
- Exercises for the XQuery Workshops at XQuery at DH2017 ☆50 · Updated 6 years ago
- An open source online storytelling platform for everyone. Built by Cogapp. ☆27 · Updated last month
- Digital Mappa (DM for short) is a freely available online environment for creating projects out of digital images and texts. ☆21 · Updated 4 months ago
- Self hosting code for Recogito-Studio ☆17 · Updated last week
- Brucheion is a Virtual Research Environment (VRE) to create Linked Open Data (LOD) for historical languages and the research of historica… ☆14 · Updated 2 years ago
- Special Topics in AI: Artificial Intelligence as an Archival Science ☆16 · Updated 10 months ago
- Awesome AI in Libraries ☆16 · Updated last year
- Best Practices for TEI in Libraries: A guide for mass digitization, automated workflows, and promotion of interoperability with XML using… ☆31 · Updated 6 years ago
- Discovering IIIF manifests ☆18 · Updated last year
- A static site generator for TEI Publisher ☆12 · Updated 3 years ago
- The main TEI Publisher app ☆71 · Updated 2 weeks ago
- Instructions, exercises and example data sets for Annif hands-on tutorial ☆40 · Updated last month
- Locolligo is a single-page, browser-based javascript application to facilitate the formatting, linking, and geolocation of datasets, with… ☆14 · Updated last year
- A highly customizable plugin for setting up and activating remote-driven autocompletions of attribute values in the oXygen XML Editor. ☆19 · Updated 7 months ago
- Heritage Connector: Transforming text into data to extract meaning and make connections ☆24 · Updated 2 years ago
- Oral History/Qualitative Interview Data Analysis and Publication Tool ☆19 · Updated last year
- Data space of the DARIAH Lexical Resources Working Group ☆21 · Updated last week
- Modeling and visualizing physical manuscript collation ☆50 · Updated 2 years ago
- LD4P Sinopia Project repo to hold docs, general issues, schemas, and related spec docs. ☆21 · Updated last year
- EFES (EpiDoc Front End Services) is a custom and readily customizable platform for publication and search/indexing of EpiDoc files, based… ☆30 · Updated last month
- CollateX – Software for Collating Textual Sources ☆92 · Updated last year
- The Jupyter Book is aimed at historians who are looking for a first interactive introduction to the Python programming language in German… ☆14 · Updated 2 years ago