google-research-datasets / AIS
AIS ("Attributable to Identified Sources") is an evaluation framework for assessing whether the output of natural language models contains only information about the external world that is verifiable in source documents.
☆30, updated last year
Related projects
Alternatives and complementary repositories for AIS
- Official repository for our EACL 2023 paper "LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization" (https… ☆43, updated 3 months ago
- This repository contains the dataset and code for "WiCE: Real-World Entailment for Claims in Wikipedia" in EMNLP 2023. ☆39, updated 10 months ago
- Apps built using Inspired Cognition's Critique. ☆58, updated last year
- Code and pre-trained models for "ReasonBert: Pre-trained to Reason with Distant Supervision", EMNLP 2021 ☆29, updated last year
- This repository contains the code for "How many data points is a prompt worth?" ☆49, updated 3 years ago
- Official implementation of the paper "IteraTeR: Understanding Iterative Revision from Human-Written Text" (ACL 2022) ☆76, updated 11 months ago
- This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?” ☆84, updated 2 years ago
- Resources for the shared task on conversational question answering SCAI-QReCC 2021 ☆27, updated 2 years ago
- Code & data for EMNLP 2020 paper "MOCHA: A Dataset for Training and Evaluating Reading Comprehension Metrics". ☆16, updated 2 years ago
- Code for NAACL 2022 paper "Reframing Human-AI Collaboration for Generating Free-Text Explanations" ☆31, updated last year
- Companion repo for "Evaluating Verifiability in Generative Search Engines". ☆81, updated last year
- Easy-to-use MIRAGE code for faithful answer attribution in RAG applications. Paper: https://arxiv.org/abs/2406.13663 ☆15, updated 3 weeks ago
- Code and data accompanying the paper "TRUE: Re-evaluating Factual Consistency Evaluation". ☆71, updated 2 weeks ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ☆44, updated 11 months ago
- TBC ☆26, updated 2 years ago
- Code for the arXiv paper: "LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond" ☆58, updated 7 months ago
- Repository for "Teaching Broad Reasoning Skills for Multi-Step QA by Generating Hard Contexts" (EMNLP 2022) ☆17, updated last year
- Repo for "On Learning to Summarize with Large Language Models as References" ☆42, updated last year
- Efficient Memory-Augmented Transformers ☆35, updated last year
- Code for preprint: "Summarizing Differences between Text Distributions with Natural Language" ☆42, updated last year
- A Human-LLM Collaborative Dataset for Generative Information-seeking with Attribution ☆30, updated last year