google-research-datasets / AIS
AIS is an evaluation framework for assessing whether the output of natural language models contains only information about the external world that is verifiable in source documents, that is, whether it is "Attributable to Identified Sources".
☆31 · Updated 2 years ago
Alternatives and similar repositories for AIS:
Users who are interested in AIS are comparing it to the repositories listed below:
- Efficient Memory-Augmented Transformers ☆34 · Updated 2 years ago
- Official repository for our EACL 2023 paper "LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization" (https… ☆43 · Updated 5 months ago
- Query-focused summarization data ☆41 · Updated last year
- ☆36 · Updated last year
- Code and pre-trained models for "ReasonBert: Pre-trained to Reason with Distant Supervision", EMNLP'2021 ☆29 · Updated last year
- This repository contains the dataset and code for "WiCE: Real-World Entailment for Claims in Wikipedia" in EMNLP 2023. ☆40 · Updated last year
- This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?” ☆85 · Updated 2 years ago
- Resources for the shared task on conversational question answering SCAI-QReCC 2021 ☆27 · Updated 2 years ago
- ☆48 · Updated last year
- Dense hybrid representations for text retrieval ☆61 · Updated last year
- Official implementation of the paper "IteraTeR: Understanding Iterative Revision from Human-Written Text" (ACL 2022) ☆78 · Updated last year
- Official codebase accompanying our ACL 2022 paper "RELiC: Retrieving Evidence for Literary Claims" (https://relic.cs.umass.edu). ☆20 · Updated 2 years ago
- ☆97 · Updated 2 years ago
- Code for NAACL 2022 paper "Reframing Human-AI Collaboration for Generating Free-Text Explanations" ☆31 · Updated last year
- Code for the paper "Simulating Bandit Learning from User Feedback for Extractive Question Answering". ☆18 · Updated 2 years ago
- The dataset and code for ACL 2022 paper "SciNLI: A Corpus for Natural Language Inference on Scientific Text" are released here. ☆26 · Updated last year
- Detect hallucinated tokens for conditional sequence generation. ☆64 · Updated 2 years ago
- ☆55 · Updated 2 years ago
- ☆58 · Updated 2 years ago
- Apps built using Inspired Cognition's Critique. ☆58 · Updated last year
- Few-shot NLP benchmark for unified, rigorous eval ☆91 · Updated 2 years ago
- ☆33 · Updated last year
- ☆15 · Updated 3 years ago
- TBC ☆26 · Updated 2 years ago
- Mr. TyDi is a multi-lingual benchmark dataset built on TyDi, covering eleven typologically diverse languages. ☆72 · Updated 2 years ago
- This repository contains the code for "How many data points is a prompt worth?" ☆48 · Updated 3 years ago
- Benchmarking Generalization to New Tasks from Natural Language Instructions ☆26 · Updated 3 years ago
- FaVIQ: Fact Verification from Information-seeking Questions ☆43 · Updated 2 years ago
- Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models" ☆26 · Updated 3 years ago
- This is the official repository for NAACL 2021, "XOR QA: Cross-lingual Open-Retrieval Question Answering". ☆79 · Updated 3 years ago