avi-otterai / SWOW-eval
Intrinsic evaluation of pre-trained word embeddings using a large word association dataset: SWOW (Small World of Words)
☆11 Updated last year
Alternatives and similar repositories for SWOW-eval:
Users who are interested in SWOW-eval compare it to the repositories listed below
- This repository contains the code for the Form-Context Model and its Attentive Mimicking variant. ☆31 Updated 4 years ago
- Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models" ☆26 Updated 3 years ago
- Statistics on multilingual datasets ☆17 Updated 2 years ago
- Repository with code for MaChAmp: https://aclanthology.org/2021.eacl-demos.22/ ☆86 Updated 2 weeks ago
- This repository contains materials for the SIGIR 2022 tutorial on opinion summarization. ☆34 Updated 2 years ago
- Code for the paper "Measuring Bias in Contextualized Word Representations" ☆35 Updated 5 years ago
- A Word Sense Disambiguation system integrating implicit and explicit external knowledge. ☆69 Updated 3 years ago
- ☆16 Updated 5 years ago
- ☆38 Updated 4 years ago
- XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning ☆102 Updated 4 years ago
- Official codebase accompanying our ACL 2022 paper "RELiC: Retrieving Evidence for Literary Claims" (https://relic.cs.umass.edu). ☆20 Updated 2 years ago
- EMNLP 2021 Tutorial: Multi-Domain Multilingual Question Answering ☆38 Updated 3 years ago
- Source code accompanying the KONVENS 2019 paper "Does BERT Make Any Sense? Interpretable Word Sense Disambiguation with Contextualized Em… ☆65 Updated 5 years ago
- UFSAC is a resource containing all WordNet Sense Annotated Corpora, and a Java library for manipulating them ☆37 Updated 2 years ago
- As good as new. How to successfully recycle English GPT-2 to make models for other languages (ACL Findings 2021) ☆48 Updated 3 years ago
- M2D2: A Massively Multi-domain Language Modeling Dataset (EMNLP 2022) by Machel Reid, Victor Zhong, Suchin Gururangan, Luke Zettlemoyer ☆55 Updated 2 years ago
- An implementation of GrASP (Shnarch et al., 2017) ☆21 Updated 2 years ago
- A python tool for building large scale Wikipedia-based Information Retrieval datasets ☆46 Updated 3 years ago
- Automatically detect errors in annotated corpora. ☆47 Updated last year
- ☆75 Updated 3 years ago
- This is the code for loading the SenseBERT model, described in our paper from ACL 2020. ☆44 Updated 2 years ago
- BERT models for many languages created from Wikipedia texts ☆33 Updated 4 years ago
- Semantically Structured Sentence Embeddings ☆65 Updated 6 months ago
- ☆54 Updated 3 years ago
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do… ☆80 Updated 9 months ago
- Build a dialog dataset from online books in many languages ☆72 Updated 2 years ago
- This repository hosts the code for a tokenizer of tweets. ☆12 Updated 6 years ago
- ☆21 Updated 4 years ago
- Hyperparameter Search for AllenNLP ☆139 Updated last month
- Perspectrum: a dataset of claims, perspectives and evidence documents ☆33 Updated 5 years ago