CyberZHG / wiki-dump-reader
Extract corpora from Wikipedia dumps
☆26 Updated 6 years ago
Alternatives and similar repositories for wiki-dump-reader
Users who are interested in wiki-dump-reader compare it to the libraries listed below.
- XAI Tutorial for the Explainable AI track in the ALPS winter school 2021 ☆56 Updated 4 years ago
- Main repository for "CharacterBERT: Reconciling ELMo and BERT for Word-Level Open-Vocabulary Representations From Characters" ☆199 Updated 2 years ago
- The Benchmark of Linguistic Minimal Pairs ☆161 Updated 3 years ago
- Diagnostic tests for linguistic capacities in language models ☆65 Updated 3 years ago
- ☆21 Updated 5 years ago
- Code to reproduce the experiments from the paper. ☆103 Updated 2 years ago
- Multi-Annotator Competence Estimation tool ☆134 Updated 2 weeks ago
- Code and data for: Low Resource Grammatical Error Correction Using Wikipedia Edits (WNUT 2018) ☆17 Updated last year
- Lexical Simplification with Pretrained Encoders ☆70 Updated 5 years ago
- This is the reference implementation of commonly used coreference metrics. ☆76 Updated 7 years ago
- MT Evaluation in Many Languages via Zero-Shot Paraphrasing ☆102 Updated last year
- ESC: Redesigning WSD with Extractive Sense Comprehension ☆26 Updated 4 years ago
- STREUSLE: a corpus with comprehensive lexical semantic annotation (multiword expressions, supersenses) ☆69 Updated 2 months ago
- Repository with code for MaChAmp: https://aclanthology.org/2021.eacl-demos.22/ ☆90 Updated this week
- UFSAC is a resource containing all WordNet Sense Annotated Corpora, and a Java library for manipulating them ☆38 Updated 3 years ago
- Scientific Document Summarization Corpus and Annotations from the WING NUS group. ☆215 Updated 2 years ago
- Language Modelling Makes Sense - WSD (and more) with Contextual Embeddings ☆96 Updated 2 years ago
- A single model that parses Universal Dependencies across 75 languages. Given a sentence, jointly predicts part-of-speech tags, morphology… ☆225 Updated 3 years ago
- Disambiguate is a tool for training and using state of the art neural WSD models ☆60 Updated 6 months ago
- DBMDZ BERT, DistilBERT, ELECTRA, GPT-2 and ConvBERT models ☆157 Updated 3 years ago
- Easier Automatic Sentence Simplification Evaluation ☆166 Updated 2 years ago
- This repo contains a set of neural transducers, e.g. sequence-to-sequence models, focusing on character-level tasks. ☆76 Updated 2 years ago
- Datasets for the Monolingual Word Sense Alignment (MWSA) task ☆12 Updated 5 years ago
- A Word Sense Disambiguation system integrating implicit and explicit external knowledge. ☆69 Updated 4 years ago
- ☆21 Updated 5 years ago
- Efficient Low-Memory Aligner ☆146 Updated last year
- Massively Multilingual Transfer for NER ☆86 Updated 4 years ago
- Benchmarks for intrinsic word embeddings evaluation. ☆66 Updated 7 years ago
- This repository houses the IMPlicature and PRESupposition diagnostic dataset (IMPPRES), consisting of >25k semiautomatically generated se… ☆19 Updated 4 years ago
- Experiment code for the ACL 2020 paper "Moving Down the Long Tail of Word Sense Disambiguation with Gloss Informed Bi-encoders". ☆53 Updated 2 years ago