yuh-zha / AlignScore
ACL2023 - AlignScore, a metric for factual consistency evaluation.
☆151 · Mar 11, 2024 · Updated last year
Alternatives and similar repositories for AlignScore
Users who are interested in AlignScore compare it to the libraries listed below.
- FactCG: Enhancing Fact Checkers with Graph-Based Multi-Hop Data (NAACL 2025) ☆14 · Jul 14, 2025 · Updated 7 months ago
- A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic… ☆415 · Apr 13, 2025 · Updated 10 months ago
- Resources for the "Evaluating the Factual Consistency of Abstractive Text Summarization" paper ☆309 · May 1, 2025 · Updated 9 months ago
- Token-level Reference-free Hallucination Detection ☆98 · Jul 25, 2023 · Updated 2 years ago
- Technical Report: Is ChatGPT a Good NLG Evaluator? A Preliminary Study ☆43 · Mar 8, 2023 · Updated 2 years ago
- Align, a general text alignment function ☆15 · Dec 7, 2023 · Updated 2 years ago
- Codebase, data and models for the SummaC paper in TACL ☆108 · Jan 30, 2025 · Updated last year
- Detect hallucinated tokens for conditional sequence generation. ☆64 · Apr 15, 2022 · Updated 3 years ago
- ☆26 · Nov 21, 2022 · Updated 3 years ago
- Code for NAACL 2025 paper "AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge" ☆16 · Oct 14, 2024 · Updated last year
- RARR: Researching and Revising What Language Models Say, Using Language Models ☆52 · Jun 22, 2023 · Updated 2 years ago
- RefChecker provides automatic checking pipeline and benchmark dataset for detecting fine-grained hallucinations generated by Large Langua… ☆417 · May 16, 2025 · Updated 8 months ago
- Code for Controlling Hallucinations at Word Level in Data-to-Text Generation (C. Rebuffel, M. Roberti, L. Soulier, G. Scoutheeten, R. Can… ☆16 · Jun 12, 2023 · Updated 2 years ago
- Understanding Factual Errors in Summarization: Errors, Summarizers, Datasets, Error Detectors (ACL 2023) ☆28 · Mar 26, 2024 · Updated last year
- Repository for EMNLP 2022 Paper: Towards a Unified Multi-Dimensional Evaluator for Text Generation ☆214 · Feb 10, 2024 · Updated 2 years ago
- ☆27 · Nov 6, 2022 · Updated 3 years ago
- Source Code of Paper "GPTScore: Evaluate as You Desire" ☆258 · Feb 21, 2023 · Updated 2 years ago
- Question Answering and Generation for Summarization ☆71 · Nov 27, 2022 · Updated 3 years ago
- This is the repository of HaluEval, a large-scale hallucination evaluation benchmark for Large Language Models. ☆554 · Feb 12, 2024 · Updated 2 years ago
- List of papers on hallucination detection in LLMs. ☆1,046 · Jan 11, 2026 · Updated last month
- Github repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023) ☆63 · Dec 25, 2023 · Updated 2 years ago
- Suri: Multi-constraint instruction following for long-form text generation (EMNLP’24) ☆27 · Oct 3, 2025 · Updated 4 months ago
- [ACL 2024 Findings] CriticBench: Benchmarking LLMs for Critique-Correct Reasoning ☆30 · Mar 5, 2024 · Updated last year
- [Data + code] ExpertQA : Expert-Curated Questions and Attributed Answers ☆136 · Mar 14, 2024 · Updated last year
- Website for release of TellMeWhy dataset for why question answering ☆14 · Nov 11, 2022 · Updated 3 years ago
- ☆13 · May 21, 2023 · Updated 2 years ago
- benchmarks for evaluating MT models ☆11 · Jun 26, 2024 · Updated last year
- codes for "Self-Checker: Plug-and-Play Modules for Fact-Checking with Large Language Models" ☆12 · Feb 10, 2025 · Updated last year
- Bridging Retrieval and Inference through Evidence Fusion ☆12 · Oct 20, 2025 · Updated 3 months ago
- Prompt-Guided Retrieval For Non-Knowledge-Intensive Tasks ☆12 · Sep 1, 2023 · Updated 2 years ago
- ☆33 · Dec 17, 2025 · Updated last month
- Repository collecting resources and best practices to improve experimental rigour in deep learning research. ☆27 · Mar 30, 2023 · Updated 2 years ago
- ☆31 · May 10, 2024 · Updated last year
- ☆75 · Feb 16, 2024 · Updated last year
- [NLPCC 2024] Shared Task 10: Regulating Large Language Models ☆14 · Jun 12, 2024 · Updated last year
- ☆11 · Aug 10, 2022 · Updated 3 years ago
- https://arxiv.org/abs/2404.10917 ☆14 · Mar 18, 2025 · Updated 10 months ago
- Github repository for "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models" ☆225 · Dec 2, 2024 · Updated last year
- [ICML 2025] Official repository for paper "OR-Bench: An Over-Refusal Benchmark for Large Language Models" ☆21 · Mar 4, 2025 · Updated 11 months ago