YaleDHLab / intertext
Detect and visualize text reuse
☆118 Updated 7 months ago
Alternatives and similar repositories for intertext:
Users who are interested in intertext are comparing it to the libraries listed below:
- SEM, a free NLP tool relying on machine learning technologies, especially CRFs. ☆24 Updated 3 years ago
- A trend viewer written in Python/JavaScript ☆21 Updated 5 months ago
- A Python library for topic modeling and visualization ☆65 Updated 4 years ago
- High-performance text aligner for large collections of texts ☆51 Updated last week
- Tutorial on NE processing for Digital Humanities - DH Utrecht 2019 ☆25 Updated 5 years ago
- German lemmatization with IWNLP as extension for spaCy ☆24 Updated last year
- Visualize large text collections with WebGL ☆25 Updated 7 months ago
- Graph-based tool for disambiguation and linking of named entities to Linked Data sets for Digital Humanities and heritage texts ☆27 Updated 3 years ago
- A spaCy wrapper of OpenTapioca for named entity linking on Wikidata ☆94 Updated 2 years ago
- Detect and align similar passages ☆100 Updated 2 months ago
- A fully-fledged PyTorch package for Morphological Analysis, tailored to morphologically rich and historical languages. ☆23 Updated last year
- Python package for harvesting records from OAI-PMH provider(s). ☆62 Updated 2 years ago
- A Flexible Deep Learning Approach to Fuzzy String Matching ☆145 Updated 6 months ago
- HuCit KB: a knowledge base of classical texts and citable text units. ☆11 Updated 3 years ago
- A part-of-speech tagger with support for domain adaptation and external resources. ☆22 Updated 2 years ago
- A visualisation tool for Spacy using Hierplane. ☆65 Updated 2 years ago
- Humanities Entity Recognition: robust, practical, efficient Named Entity Recognition for today's digital humanist ☆36 Updated 6 years ago
- A Named-Entity Recogniser based on Grobid. ☆52 Updated 7 months ago
- A lemmatizer for German language text ☆89 Updated 2 years ago
- A Python module for interfacing with the Treetagger by Helmut Schmid. ☆75 Updated 3 years ago
- spaCy pipeline component for adding text readability meta data to Doc objects. ☆56 Updated 6 years ago
- Software to detect text reuse with BLAST. ☆14 Updated 5 years ago
- 📂 Additional lookup tables and data resources for spaCy ☆106 Updated 3 months ago
- Code and models for our CLEF-HIPE (Named Entity Processing on Historical Newspapers) submissions ☆19 Updated 2 years ago
- Natural language processing resources for multiple languages, with an eye towards use for digital humanities. ☆126 Updated 3 years ago
- Python port for IWNLP.Lemmatizer ☆17 Updated last year
- A set of workflows for corpus building through OCR, post-correction and normalisation ☆48 Updated 2 years ago
- linguistics backend ☆41 Updated 2 years ago
- Named Entities Recognition Annotator Tool for Europeana Newspapers ☆60 Updated 7 years ago
- Python package for stylometry ☆63 Updated 4 years ago