allenai / winogrande
WinoGrande: An Adversarial Winograd Schema Challenge at Scale
☆88 Updated 4 years ago
Related projects
Alternatives and complementary repositories for winogrande
- Heuristic Analysis for NLI Systems ☆124 Updated 3 years ago
- ☆57 Updated last year
- Faithfulness and factuality annotations of XSum summaries from our paper "On Faithfulness and Factuality in Abstractive Summarization" (h… ☆81 Updated 4 years ago
- ☆81 Updated 2 years ago
- A framework for building semantic parsers (including neural module networks) with AllenNLP, built by the authors of AllenNLP ☆107 Updated 2 years ago
- A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations ☆54 Updated 2 years ago
- Code and data to support the paper "PAQ: 65 Million Probably-Asked Questions and What You Can Do With Them" ☆202 Updated 3 years ago
- GLUCOSE: GeneraLized and COntextualized Story Explanations https://arxiv.org/abs/2009.07758 ☆92 Updated 3 years ago
- How Contextual are Contextualized Word Representations? ☆39 Updated 4 years ago
- This repository houses the IMPlicature and PRESupposition diagnostic dataset (IMPPRES), consisting of >25k semiautomatically generated se… ☆19 Updated 3 years ago
- Hyperparameter Search for AllenNLP ☆134 Updated 4 years ago
- EMNLP DiscoEval paper ☆42 Updated 5 years ago
- An original implementation of EMNLP 2020, "AmbigQA: Answering Ambiguous Open-domain Questions" ☆118 Updated 2 years ago
- A benchmark for understanding and evaluating rationales: http://www.eraserbenchmark.com/ ☆97 Updated 2 years ago
- Repository for the Question Answering via Sentence Composition (QASC) dataset ☆52 Updated last year
- ☆158 Updated 2 years ago
- A BART version of an open-domain QA model in a closed-book setup ☆119 Updated 4 years ago
- ☆58 Updated 2 years ago
- ☆37 Updated 3 years ago
- Diagnostic tests for linguistic capacities in language models ☆66 Updated 2 years ago
- FRANK: Factuality Evaluation Benchmark ☆52 Updated last year
- Commonsense Explanations Dataset and Code ☆146 Updated 3 years ago
- ☆46 Updated 4 years ago
- Language model Prompt And Query Archive ☆157 Updated 3 years ago
- [EMNLP 2020] Collective HumAn OpinionS on Natural Language Inference Data ☆33 Updated 2 years ago
- Evaluating recurrent neural networks on predicting subject-verb agreement dependencies ☆61 Updated last year
- ☆45 Updated last year
- SacreROUGE is a library dedicated to the use and development of text generation evaluation metrics with an emphasis on summarization. ☆140 Updated 2 years ago
- Code and Data for Evaluation WG ☆41 Updated 2 years ago
- Simple language-driven navigation tasks for studying compositional learning ☆188 Updated 4 years ago