google-research-datasets / AIS
AIS ("Attributable to Identified Sources") is an evaluation framework for assessing whether the output of natural language models contains only information about the external world that is verifiable in source documents.
☆31 · Updated 2 years ago
Alternatives and similar repositories for AIS:
Users who are interested in AIS compare it to the repositories listed below:
- Official repository for our EACL 2023 paper "LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization" (https… ☆43 · Updated 7 months ago
- Query-focused summarization data ☆41 · Updated 2 years ago
- TBC ☆26 · Updated 2 years ago
- ☆58 · Updated 2 years ago
- This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?” ☆85 · Updated 2 years ago
- Code and pre-trained models for "ReasonBert: Pre-trained to Reason with Distant Supervision", EMNLP'2021 ☆29 · Updated 2 years ago
- ☆33 · Updated last year
- ☆38 · Updated last year
- Resources for the shared task on conversational question answering SCAI-QReCC 2021 ☆29 · Updated 2 years ago
- Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models" ☆26 · Updated 3 years ago
- Efficient Memory-Augmented Transformers ☆34 · Updated 2 years ago
- This repository contains the dataset and code for "WiCE: Real-World Entailment for Claims in Wikipedia" in EMNLP 2023. ☆41 · Updated last year
- Easy-to-use MIRAGE code for faithful answer attribution in RAG applications. Paper: https://aclanthology.org/2024.emnlp-main.347/ ☆21 · Updated 3 weeks ago
- Dense hybrid representations for text retrieval ☆62 · Updated last year
- ☆97 · Updated 2 years ago
- ☆48 · Updated 2 years ago
- FaVIQ: Fact Verification from Information-seeking Questions ☆43 · Updated 2 years ago
- Apps built using Inspired Cognition's Critique. ☆58 · Updated 2 years ago
- ☆54 · Updated 2 years ago
- Code for paper "Leakage-Adjusted Simulatability: Can Models Generate Non-Trivial Explanations of Their Behavior in Natural Language?" ☆21 · Updated 4 years ago
- We are creating a challenging new benchmark MultiReQA: A Cross-Domain Evaluation for Retrieval Question Answering Models. Retrieval quest… ☆31 · Updated 4 years ago
- Detect hallucinated tokens for conditional sequence generation. ☆64 · Updated 2 years ago
- ☆38 · Updated 3 months ago
- Official implementation of the paper "IteraTeR: Understanding Iterative Revision from Human-Written Text" (ACL 2022) ☆78 · Updated last year
- The official implementation of "Evidentiality-guided Generation for Knowledge-Intensive NLP Tasks" (NAACL 2022). ☆43 · Updated 2 years ago
- Code for preprint: Summarizing Differences between Text Distributions with Natural Language ☆42 · Updated 2 years ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ☆47 · Updated last year
- Few-shot NLP benchmark for unified, rigorous eval ☆91 · Updated 2 years ago
- Code & data for EMNLP 2020 paper "MOCHA: A Dataset for Training and Evaluating Reading Comprehension Metrics". ☆16 · Updated 2 years ago
- ☆15 · Updated 3 years ago