YaleDHLab / intertext
Detect and visualize text reuse
☆118 Updated 6 months ago
Alternatives and similar repositories for intertext:
Users who are interested in intertext are comparing it to the libraries listed below.
- A Python library for topic modeling and visualization ☆65 Updated 4 years ago
- Tutorial on NE processing for Digital Humanities - DH Utrecht 2019 ☆25 Updated 5 years ago
- Detect and align similar passages ☆98 Updated last month
- Code and models for our CLEF-HIPE (Named Entity Processing on Historical Newspapers) submissions ☆19 Updated last year
- SEM, a free NLP tool relying on machine learning technologies, especially CRFs. ☆24 Updated 3 years ago
- High-performance text aligner for large collections of texts ☆49 Updated 4 months ago
- METS/ALTO OCR enhancing tool by the National Library of Luxembourg (BnL) ☆53 Updated last year
- Trying to generate name synonyms from wikidata ☆32 Updated 4 years ago
- Python port for IWNLP.Lemmatizer ☆17 Updated last year
- Python package for harvesting records from OAI-PMH provider(s). ☆62 Updated 2 years ago
- Visualize large text collections with WebGL ☆25 Updated 6 months ago
- A Python module for interfacing with the Treetagger by Helmut Schmid. ☆75 Updated 3 years ago
- A digital humanities operating system that runs on a USB disk. ☆31 Updated 7 years ago
- A set of workflows for corpus building through OCR, post-correction and normalisation ☆48 Updated 2 years ago
- A trend viewer written in Python/JavaScript ☆21 Updated 3 months ago
- Graph-based tool for disambiguation and linking of named entities to Linked Data sets for Digital Humanities and heritage texts ☆27 Updated 3 years ago
- Natural language processing resources for multiple languages, with an eye towards use for digital humanities. ☆126 Updated 3 years ago
- Named-Entity Recognition extension for Google Refine / OpenRefine ☆72 Updated 7 years ago
- A spaCy wrapper of OpenTapioca for named entity linking on Wikidata ☆94 Updated last year
- Repository for "Towards Robust Named Entity Recognition for Historic German" ☆18 Updated 4 years ago
- CLI for loading Wikidata subsets (or all of it) into Elasticsearch ☆70 Updated 3 years ago
- A lemmatizer for German language text ☆88 Updated 2 years ago
- PAGE XML format collection for document image page content and more ☆67 Updated 3 years ago
- Functional and structural analysis of tables in research papers (Table disentangling) ☆20 Updated 7 years ago
- A fully-fledged PyTorch package for Morphological Analysis, tailored to morphologically rich and historical languages. ☆23 Updated last year
- Deutsches Lyrik Korpus (DLK) / German Poetry Corpus ☆18 Updated 9 months ago
- Python tools for performing various operations on ALTO XML files ☆45 Updated 2 weeks ago
- frontend app for our Digital Journal ☆22 Updated this week
- Workshop materials for our DH2018 workshop on word vectors. Created by Eun Seo Jo, Javier de la Rosa, and Scott Bailey ☆15 Updated 6 years ago
- Multi Tier Annotation Search ☆26 Updated 3 years ago