google-research-datasets / AIS
AIS is an evaluation framework for assessing whether the output of natural language models only contains information about the external world that is verifiable in source documents, or "Attributable to Identified Sources".
☆31Updated 2 years ago
Alternatives and similar repositories for AIS
Users who are interested in AIS are comparing it to the repositories listed below.
- This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?”☆85Updated 3 years ago
- Official implementation of the paper "IteraTeR: Understanding Iterative Revision from Human-Written Text" (ACL 2022)☆78Updated last year
- ☆51Updated 2 years ago
- Official repository for our EACL 2023 paper "LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization" (https…☆44Updated last year
- Query-focused summarization data☆42Updated 2 years ago
- Code and pre-trained models for "ReasonBert: Pre-trained to Reason with Distant Supervision", EMNLP'2021☆29Updated 2 years ago
- ☆97Updated 3 years ago
- Contrastive Fact Verification☆73Updated 2 years ago
- This repository contains the dataset and code for "WiCE: Real-World Entailment for Claims in Wikipedia" in EMNLP 2023.☆42Updated last year
- Few-shot NLP benchmark for unified, rigorous eval☆92Updated 3 years ago
- The official implementation of "Evidentiality-guided Generation for Knowledge-Intensive NLP Tasks" (NAACL 2022).☆44Updated 2 years ago
- Code and data accompanying the paper "TRUE: Re-evaluating Factual Consistency Evaluation".☆81Updated last month
- ☆58Updated 3 years ago
- ☆37Updated 8 months ago
- Code for Evaluating Explanations for Reading Comprehension with Realistic Counterfactuals.☆18Updated 4 years ago
- ☆100Updated last year
- PyTorch code for "FactPEGASUS: Factuality-Aware Pre-training and Fine-tuning for Abstractive Summarization" (NAACL 2022)☆39Updated 2 years ago
- This repository contains the code for "How many data points is a prompt worth?"☆48Updated 4 years ago
- XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning☆105Updated 4 years ago
- Apps built using Inspired Cognition's Critique.☆58Updated 2 years ago
- The data and code for EmailSum☆61Updated 4 years ago
- ☆54Updated 2 years ago
- FaVIQ: Fact Verification from Information-seeking Questions☆43Updated 2 years ago
- Code, data, and pretrained models for the paper "Generating Wikipedia Article Sections from Diverse Data Sources"☆20Updated 4 years ago
- EMNLP 2021 - CTC: A Unified Framework for Evaluating Natural Language Generation☆98Updated 2 years ago
- Resources for the shared task on conversational question answering SCAI-QReCC 2021☆29Updated 3 years ago
- Transfer Learning in Dialogue Benchmarking Toolkit☆14Updated 2 years ago
- Code and dataset for the EMNLP 2021 Finding paper "Can NLI Models Verify QA Systems’ Predictions?"☆25Updated 2 years ago
- ☆30Updated 3 years ago
- Detect hallucinated tokens for conditional sequence generation.☆64Updated 3 years ago