allenai / winogrande
WinoGrande: An Adversarial Winograd Schema Challenge at Scale
☆87 Updated 4 years ago
Related projects:
- Heuristic Analysis for NLI Systems ☆125 Updated 3 years ago
- ☆57 Updated last year
- A BART version of an open-domain QA model in a closed-book setup ☆120 Updated 4 years ago
- Faithfulness and factuality annotations of XSum summaries from our paper "On Faithfulness and Factuality in Abstractive Summarization" (h… ☆80 Updated 3 years ago
- A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations ☆54 Updated 2 years ago
- Code and data to support the paper "PAQ: 65 Million Probably-Asked Questions and What You Can Do With Them" ☆200 Updated 3 years ago
- A framework for building semantic parsers (including neural module networks) with AllenNLP, built by the authors of AllenNLP ☆107 Updated 2 years ago
- SacreROUGE is a library dedicated to the use and development of text generation evaluation metrics with an emphasis on summarization. ☆135 Updated last year
- A benchmark for understanding and evaluating rationales: http://www.eraserbenchmark.com/ ☆96 Updated last year
- Automatic metrics for GEM tasks ☆61 Updated last year
- Data for evaluating gender bias in coreference resolution systems. ☆65 Updated 5 years ago
- Code and Data for Evaluation WG ☆41 Updated 2 years ago
- ☆79 Updated 2 years ago
- An original implementation of EMNLP 2020, "AmbigQA: Answering Ambiguous Open-domain Questions" ☆116 Updated 2 years ago
- This repository contains the code for "Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP". ☆85 Updated 3 years ago
- Code to reproduce LREC Paper Simplifying Semantic Annotations of SMCalFlow ☆25 Updated 5 months ago
- A library for finding knowledge neurons in pretrained transformer models. ☆145 Updated 2 years ago
- GLUCOSE: GeneraLized and COntextualized Story Explanations https://arxiv.org/abs/2009.07758 ☆92 Updated 3 years ago
- [EMNLP 2020] Collective HumAn OpinionS on Natural Language Inference Data ☆33 Updated 2 years ago
- Official repository for "Action-Based Conversations Dataset: A Corpus for Building More In-Depth Task-Oriented Dialogue Systems" ☆66 Updated 2 years ago
- Few-shot NLP benchmark for unified, rigorous eval ☆91 Updated 2 years ago
- ☆57 Updated 2 years ago
- This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?” ☆83 Updated 2 years ago
- To analyze and remove gender bias in coreference resolution systems ☆74 Updated 3 years ago
- Dataset + classifier tools to study social perception biases in natural language generation ☆64 Updated last year
- Code and data for the paper: "Unsupervised Common Sense Question Answering with Self-Talk" ☆78 Updated 3 years ago
- Evaluating recurrent neural networks on predicting subject-verb agreement dependencies ☆61 Updated last year
- REALSumm: Re-evaluating Evaluation in Text Summarization ☆71 Updated last year
- Repository for the Question Answering via Sentence Composition (QASC) dataset ☆51 Updated last year
- Diagnostic tests for linguistic capacities in language models ☆66 Updated 2 years ago