CyberZHG / wiki-dump-reader
Extract corpora from Wikipedia dumps
☆25 Updated 6 years ago
Alternatives and similar repositories for wiki-dump-reader
Users who are interested in wiki-dump-reader are comparing it to the repositories listed below
- Statistics on multilingual datasets ☆17 Updated 2 years ago
- Codebase for probing and visualizing multilingual models. ☆49 Updated 5 years ago
- A program to choose transfer languages for cross-lingual learning ☆72 Updated 2 years ago
- Code for paper "Neural Semi-Markov Conditional Random Fields for Robust Character-Based Part-of-Speech Tagging" ☆16 Updated 6 years ago
- Language Modelling Makes Sense - WSD (and more) with Contextual Embeddings ☆95 Updated 2 years ago
- Frame-Semantic and PropBank Semantic Role Labeling with Syntactic Scaffolding. ☆50 Updated 4 years ago
- UFSAC is a resource containing all WordNet Sense Annotated Corpora, and a Java library for manipulating them ☆38 Updated 3 years ago
- GC4LM: A Colossal (Biased) language model for German ☆13 Updated 4 years ago
- ☆32 Updated 4 years ago
- This repository contains the code for "BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Representations". ☆64 Updated 4 years ago
- Twpipe is a pipeline toolkit that parses raw tweets into universal dependencies. ☆28 Updated 6 years ago
- ☆24 Updated 5 years ago
- The Universal Decompositional Semantics (UDS) dataset and the Decomp toolkit ☆57 Updated 2 years ago
- Self-supervised NER prototype - updated version (69 entity types - 17 broad entity groups). Uses pretrained BERT models with no fine tuni… ☆77 Updated 2 years ago
- Cross-lingual TRansfer Evaluation of Multilingual Encoders (XTREME) ☆22 Updated 5 years ago
- Massively Multilingual Transfer for NER ☆86 Updated 3 years ago
- Auxiliary GAN for WE post-specialisation ☆23 Updated 6 years ago
- Repository with code for MaChAmp: https://aclanthology.org/2021.eacl-demos.22/ ☆87 Updated last month
- A coreference evaluation package for the CoNLL and ARRAU datasets ☆40 Updated 4 years ago
- COLING 2018 Tutorial on Multilingual FrameNet: Automatic semantic role labeling for FrameNet ☆25 Updated 6 years ago
- The implementation of "Neural Machine Translation without Embeddings", NAACL 2021 ☆33 Updated 4 years ago
- This repository contains the code for the Form-Context Model and its Attentive Mimicking variant. ☆31 Updated 5 years ago
- SUM-QE, a BERT-based Summary Quality Estimation Model ☆21 Updated last year
- LongSumm - Scientific Document Summarization Task ☆74 Updated 2 years ago
- PyTorch code for the EMNLP 2020 paper "Embedding Words in Non-Vector Space with Unsupervised Graph Learning" ☆41 Updated 4 years ago
- Assessing syntactic abilities of BERT ☆148 Updated 6 years ago
- Codebase for the Text-based NP Enrichment (TNE) paper ☆19 Updated last year
- A Word Sense Disambiguation system integrating implicit and explicit external knowledge. ☆69 Updated 3 years ago
- Word Sense Induction with BERT MLM ☆28 Updated last year
- Data and code for Kang et al., EMNLP 2019's paper titled "(Male, Bachelor) and (Female, Ph.D) have different connotations: Parallelly Ann… ☆29 Updated 5 years ago