allenai / winogrande
WinoGrande: An Adversarial Winograd Schema Challenge at Scale
☆96 Updated 5 years ago
Alternatives and similar repositories for winogrande
Users that are interested in winogrande are comparing it to the libraries listed below
- Heuristic Analysis for NLI Systems ☆125 Updated 4 years ago
- ☆59 Updated 2 years ago
- A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations ☆56 Updated 2 years ago
- Faithfulness and factuality annotations of XSum summaries from our paper "On Faithfulness and Factuality in Abstractive Summarization" (h… ☆82 Updated 4 years ago
- Code and data to support the paper "PAQ: 65 Million Probably-Asked Questions and What You Can Do With Them" ☆203 Updated 3 years ago
- A framework for building semantic parsers (including neural module networks) with AllenNLP, built by the authors of AllenNLP ☆108 Updated 3 years ago
- Automatic metrics for GEM tasks ☆66 Updated 2 years ago
- ☆97 Updated 2 years ago
- Evaluating recurrent neural networks on predicting subject-verb agreement dependencies ☆63 Updated 2 years ago
- Code for ACL 2020 paper: USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation (https://arxiv.org/pdf/2005.0045… ☆50 Updated 2 years ago
- ☆58 Updated 3 years ago
- XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning ☆103 Updated 4 years ago
- This is the official repository for NAACL 2021, "XOR QA: Cross-lingual Open-Retrieval Question Answering". ☆79 Updated 4 years ago
- An original implementation of the paper "CREPE: Open-Domain Question Answering with False Presuppositions" ☆16 Updated 7 months ago
- How Contextual are Contextualized Word Representations? ☆41 Updated 5 years ago
- ☆84 Updated 2 years ago
- The Benchmark of Linguistic Minimal Pairs ☆151 Updated 2 years ago
- An original implementation of EMNLP 2020, "AmbigQA: Answering Ambiguous Open-domain Questions" ☆119 Updated 3 years ago
- This repository contains the code for "Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP". ☆88 Updated 3 years ago
- SacreROUGE is a library dedicated to the use and development of text generation evaluation metrics with an emphasis on summarization. ☆144 Updated 2 years ago
- A benchmark for understanding and evaluating rationales: http://www.eraserbenchmark.com/ ☆96 Updated 2 years ago
- ☆48 Updated 2 years ago
- Diagnostic tests for linguistic capacities in language models ☆66 Updated 3 years ago
- Data for evaluating gender bias in coreference resolution systems. ☆77 Updated 6 years ago
- Commonsense Explanations Dataset and Code ☆147 Updated last week
- ☆38 Updated 2 years ago
- Code and data for "A Systematic Assessment of Syntactic Generalization in Neural Language Models" ☆28 Updated 4 years ago
- REALSumm: Re-evaluating Evaluation in Text Summarization ☆71 Updated 2 years ago
- Language model Prompt And Query Archive ☆158 Updated 4 years ago
- Data and code for Kang et al., EMNLP 2019's paper titled "(Male, Bachelor) and (Female, Ph.D) have different connotations: Parallelly Ann… ☆29 Updated 5 years ago