allenai / winogrande
WinoGrande: An Adversarial Winograd Schema Challenge at Scale
☆90Updated 4 years ago
Alternatives and similar repositories for winogrande:
Users who are interested in winogrande are comparing it to the repositories listed below.
- Heuristic Analysis for NLI Systems☆124Updated 3 years ago
- Faithfulness and factuality annotations of XSum summaries from our paper "On Faithfulness and Factuality in Abstractive Summarization" (h…☆81Updated 4 years ago
- A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations☆54Updated 2 years ago
- ☆97Updated 2 years ago
- An original implementation of EMNLP 2020, "AmbigQA: Answering Ambiguous Open-domain Questions"☆118Updated 2 years ago
- Code and data to support the paper "PAQ: 65 Million Probably-Asked Questions and What You Can Do With Them"☆201Updated 3 years ago
- XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning☆101Updated 3 years ago
- SacreROUGE is a library dedicated to the use and development of text generation evaluation metrics with an emphasis on summarization.☆139Updated 2 years ago
- How Contextual are Contextualized Word Representations?☆39Updated 4 years ago
- FRANK: Factuality Evaluation Benchmark☆52Updated 2 years ago
- Code and Data for Evaluation WG☆41Updated 2 years ago
- Semantic parsers based on encoder-decoder framework☆90Updated last year
- Diagnostic tests for linguistic capacities in language models☆66Updated 2 years ago
- ☆58Updated last year
- Code for ACL 2020 paper: USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation (https://arxiv.org/pdf/2005.0045…☆50Updated 2 years ago
- ☆82Updated 2 years ago
- This repository contains the code for "How many data points is a prompt worth?"☆48Updated 3 years ago
- Automatic metrics for GEM tasks☆63Updated 2 years ago
- A benchmark for understanding and evaluating rationales: http://www.eraserbenchmark.com/☆96Updated 2 years ago
- A framework for building semantic parsers (including neural module networks) with AllenNLP, built by the authors of AllenNLP☆107Updated 2 years ago
- Code to reproduce the LREC paper "Simplifying Semantic Annotations of SMCalFlow"☆25Updated 9 months ago
- The Benchmark of Linguistic Minimal Pairs☆144Updated 2 years ago
- ☆42Updated 4 years ago
- ☆58Updated 2 years ago
- ☆36Updated last year
- This repository houses the IMPlicature and PRESupposition diagnostic dataset (IMPPRES), consisting of >25k semiautomatically generated se…☆19Updated 3 years ago
- Dataset for NAACL 2021 paper: "DART: Open-Domain Structured Data Record to Text Generation"☆148Updated 2 years ago
- A BART version of an open-domain QA model in a closed-book setup☆120Updated 4 years ago
- Code to support the paper "Question and Answer Test-Train Overlap in Open-Domain Question Answering Datasets"☆66Updated 3 years ago
- REALSumm: Re-evaluating Evaluation in Text Summarization☆71Updated 2 years ago