jonathandunn / corpus_similarity
Measure the similarity of text corpora for 74 languages
☆13 · Updated last year
Alternatives and similar repositories for corpus_similarity:
Users who are interested in corpus_similarity are comparing it to the libraries listed below.
- Reference-less Quality Estimation of Text Simplification Systems ☆48 · Updated last year
- PANiC - PAraphrasing Noun-Compounds ☆15 · Updated 6 years ago
- Alignment and annotation for comparable documents. ☆22 · Updated 6 years ago
- Linguistic converter / merging tool for multi-level annotated corpora. Graph-based (using Python and NetworkX). ☆50 · Updated last year
- List of corpora annotated for coreference for different languages ☆17 · Updated 5 months ago
- ☆32 · Updated 3 years ago
- The Broad Twitter Corpus, an NER dataset in English stratified for time, location, social media genre, socioeconomic factors (COLING 2016… ☆66 · Updated 2 years ago
- A Python module to process data for Frame Semantic Parsing ☆23 · Updated 4 years ago
- A Word Sense Disambiguation system integrating implicit and explicit external knowledge. ☆68 · Updated 3 years ago
- Repository for rstWeb, a browser based annotation interface for Rhetorical Structure Theory ☆42 · Updated 3 months ago
- STREUSLE: a corpus with comprehensive lexical semantic annotation (multiword expressions, supersenses) ☆64 · Updated 2 years ago
- Python library to work with ConceptNet offline ☆10 · Updated last year
- Interface for reading the Paraphrase Database (PPDB) ☆24 · Updated 6 years ago
- Scripts and tools for doing unsupervised acceptability prediction. ☆15 · Updated last year
- Identifying Historical People, Places and other Entities: Shared Task on Named Entity Recognition and Linking on Historical Newspapers at… ☆22 · Updated 5 months ago
- XL-AMR is a sequence-to-graph cross-lingual AMR parser that exploits transfer learning (EMNLP2020). ☆16 · Updated 6 months ago
- Analyze Argumentation and Rhetorical Aspects in Scientific Writing. ☆19 · Updated 2 years ago
- A Python module for word inflections designed for use with spaCy. ☆92 · Updated 4 years ago
- The Mueller Report Corpus V 0.1 ☆11 · Updated 4 years ago
- A coreference evaluation package for the CoNLL and ARRAU datasets ☆40 · Updated 4 years ago
- PurePos is an open source hybrid morphological tagger. ☆16 · Updated 4 years ago
- ☆33 · Updated 3 years ago
- Implementation of a simple frame identification approach (SimpleFrameId) described in the paper "Out-of-domain FrameNet Semantic Role Lab… ☆15 · Updated 7 years ago
- Python framework for processing Universal Dependencies data ☆55 · Updated last week
- An Interactive Tool for Annotating Discourse Structure and Text Improvement ☆16 · Updated 3 years ago
- Python Multilingual Ucrel Semantic Analysis System ☆31 · Updated 5 months ago
- Unsupervised method for extracting quotation-speaker pairs from large news corpora. ☆29 · Updated 6 years ago
- An open information extraction system that provides compact extractions ☆90 · Updated 2 years ago
- spaCy pipeline component for adding text readability meta data to Doc objects. ☆56 · Updated 5 years ago
- The Universal Anaphora Scorer ☆15 · Updated 4 months ago