amazon-science / tofueval
☆31Updated last year
Alternatives and similar repositories for tofueval
Users that are interested in tofueval are comparing it to the libraries listed below
- Code and models for the paper "Questions Are All You Need to Train a Dense Passage Retriever (TACL 2023)"☆62Updated 2 years ago
- Code base of In-Context Learning for Dialogue State Tracking☆45Updated last year
- Code repo for SIGIR 2021 paper "Few-Shot Conversational Dense Retrieval"☆42Updated 3 years ago
- ☆50Updated 2 years ago
- First explanation metric (diagnostic report) for text generation evaluation☆62Updated 4 months ago
- We construct and introduce DIALFACT, a testing benchmark dataset of crowd-annotated conversational claims, paired with pieces of evidence fr…☆42Updated 2 years ago
- Understanding Factual Errors in Summarization: Errors, Summarizers, Datasets, Error Detectors (ACL 2023)☆25Updated last year
- Code, datasets, and checkpoints for the paper "Improving Passage Retrieval with Zero-Shot Question Generation (EMNLP 2022)"☆101Updated 2 years ago
- Token-level Reference-free Hallucination Detection☆94Updated last year
- Detect hallucinated tokens for conditional sequence generation.☆64Updated 3 years ago
- The official implementation of "Evidentiality-guided Generation for Knowledge-Intensive NLP Tasks" (NAACL 2022).☆44Updated 2 years ago
- Dataset, metrics, and models for TACL 2023 paper MACSUM: Controllable Summarization with Mixed Attributes.☆34Updated last year
- Factual consistency checking model for abstractive summaries (NAACL-22 Findings)☆29Updated 3 years ago
- Official repository for our EACL 2023 paper "LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization" (https…☆44Updated 11 months ago
- PyTorch implementation of EncT5: Fine-tuning T5 Encoder for Non-autoregressive Tasks☆63Updated 3 years ago
- ☆98Updated last year
- Prompt-and-Rerank: A Method for Zero-Shot and Few-Shot Textual Style Transfer☆35Updated 2 years ago
- Code for Handling Divergent Reference Texts when Evaluating Table-to-Text Generation (Dhingra et al. 2019)☆31Updated 4 years ago
- ☆49Updated 2 years ago
- FRANK: Factuality Evaluation Benchmark☆57Updated 2 years ago
- Train Dense Passage Retriever (DPR) with a single GPU☆133Updated 4 years ago
- Official implementations for (1) BlonDe: An Automatic Evaluation Metric for Document-level Machine Translation and (2) Discourse Centric …☆77Updated last year
- PyTorch code for "FactPEGASUS: Factuality-Aware Pre-training and Fine-tuning for Abstractive Summarization" (NAACL 2022)☆39Updated 2 years ago
- ☆39Updated 2 years ago
- The code implementation of the EMNLP2022 paper: DisCup: Discriminator Cooperative Unlikelihood Prompt-tuning for Controllable Text Gene…☆26Updated last year
- ACL 2022: Just Rank: Rethinking Evaluation with Word and Sentence Similarities☆35Updated 2 years ago
- Resources for paper "DialSummEval: Revisiting summarization evaluation for dialogues"☆15Updated 2 years ago
- Code associated with the ACL 2021 DExperts paper☆115Updated 2 years ago
- Code and data accompanying the paper "TRUE: Re-evaluating Factual Consistency Evaluation".☆81Updated last month
- This repository contains the dataset and code for "WiCE: Real-World Entailment for Claims in Wikipedia" in EMNLP 2023.☆41Updated last year