yuh-zha/AlignScore

ACL2023 - AlignScore, a metric for factual consistency evaluation.

☆153

Alternatives and similar repositories for AlignScore

Users who are interested in AlignScore are comparing it to the repositories listed below.

derenlei / FactCG
FactCG: Enhancing Fact Checkers with Graph-Based Multi-Hop Data (NAACL 2025)
☆15 · Jul 14, 2025 · Updated 7 months ago
salesforce / factCC
Resources for the "Evaluating the Factual Consistency of Abstractive Text Summarization" paper
☆309 · May 1, 2025 · Updated 10 months ago
microsoft / HaDes
Token-level Reference-free Hallucination Detection
☆97 · Jul 25, 2023 · Updated 2 years ago
krystalan / chatgpt_as_nlg_evaluator
Technical Report: Is ChatGPT a Good NLG Evaluator? A Preliminary Study
☆43 · Mar 8, 2023 · Updated 3 years ago
yuh-zha / Align
Align, a general text alignment function
☆15 · Dec 7, 2023 · Updated 2 years ago
tingofurro / summac
Codebase, data and models for the SummaC paper in TACL
☆109 · Jan 30, 2025 · Updated last year
violet-zct / fairseq-detect-hallucination
Detect hallucinated tokens for conditional sequence generation.
☆64 · Apr 15, 2022 · Updated 3 years ago
google-research / true
Code and data accompanying the paper "TRUE: Re-evaluating Factual Consistency Evaluation".
☆84 · Feb 20, 2026 · Updated 2 weeks ago
r-three / fib
☆26 · Nov 21, 2022 · Updated 3 years ago
HanNight / AdaCAD
Code for NAACL 2025 paper "AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge"
☆16 · Mar 2, 2026 · Updated last week
anthonywchen / RARR
RARR: Researching and Revising What Language Models Say, Using Language Models
☆53 · Jun 22, 2023 · Updated 2 years ago
Liyan06 / AggreFact
Understanding Factual Errors in Summarization: Errors, Summarizers, Datasets, Error Detectors (ACL 2023)
☆28 · Mar 26, 2024 · Updated last year
maszhongming / UniEval
Repository for EMNLP 2022 Paper: Towards a Unified Multi-Dimensional Evaluator for Text Generation
☆216 · Feb 10, 2024 · Updated 2 years ago
mcao516 / EntFA
☆27 · Nov 6, 2022 · Updated 3 years ago
jinlanfu / GPTScore
Source Code of Paper "GPTScore: Evaluate as You Desire"
☆259 · Feb 21, 2023 · Updated 3 years ago
W4ngatang / qags
Question Answering and Generation for Summarization
☆73 · Nov 27, 2022 · Updated 3 years ago
RUCAIBox / HaluEval
This is the repository of HaluEval, a large-scale hallucination evaluation benchmark for Large Language Models.
☆556 · Feb 12, 2024 · Updated 2 years ago
EdinburghNLP / awesome-hallucination-detection
List of papers on hallucination detection in LLMs.
☆1,055 · Jan 11, 2026 · Updated last month
hkust-nlp / felm
Github repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023)
☆63 · Dec 25, 2023 · Updated 2 years ago
chtmp223 / suri
Suri: Multi-constraint instruction following for long-form text generation (EMNLP’24)
☆27 · Oct 3, 2025 · Updated 5 months ago
deep-spin / hallucinations-in-nmt
☆20 · Jan 16, 2024 · Updated 2 years ago
CriticBench / CriticBench
[ACL 2024 Findings] CriticBench: Benchmarking LLMs for Critique-Correct Reasoning
☆30 · Mar 5, 2024 · Updated 2 years ago
chaitanyamalaviya / ExpertQA
[Data + code] ExpertQA : Expert-Curated Questions and Attributed Answers
☆137 · Mar 14, 2024 · Updated last year
zhuyunqi96 / LoraLPrun
☆13 · May 21, 2023 · Updated 2 years ago
THUNLP-MT / PGRA
Prompt-Guided Retrieval For Non-Knowledge-Intensive Tasks
☆12 · Sep 1, 2023 · Updated 2 years ago
StonyBrookNLP / tellmewhy
Website for release of TellMeWhy dataset for why question answering
☆14 · Nov 11, 2022 · Updated 3 years ago
Miaoranmmm / SelfChecker
codes for "Self-Checker: Plug-and-Play Modules for Fact-Checking with Large Language Models"
☆12 · Feb 10, 2025 · Updated last year
Helsinki-NLP / OPUS-MT-testsets
benchmarks for evaluating MT models
☆11 · Jun 26, 2024 · Updated last year
PrimerAI / blanc
Human-free quality estimation of document summaries
☆97 · Dec 1, 2025 · Updated 3 months ago
Yixiao-Song / VeriScore
☆33 · Dec 17, 2025 · Updated 2 months ago
zorazrw / odex
[EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation
☆49 · Dec 22, 2023 · Updated 2 years ago
amazon-science / tofueval
☆32 · May 10, 2024 · Updated last year
ParticleMedia / RAGTruth
Github repository for "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models"
☆229 · Dec 2, 2024 · Updated last year
D2I-ai / Route
ROUTE: Robust Multitask Tuning and Collaboration for Text-to-SQL (ICLR 2025 Pytorch Code)
☆17 · May 15, 2025 · Updated 9 months ago
vidhishanair / FactEdit
☆14 · Aug 30, 2023 · Updated 2 years ago
yale-nlp / ODSum
Data and code for paper "ODSum: New Benchmarks for Open Domain Multi-Document Summarization"
☆11 · Sep 20, 2024 · Updated last year
DianboWork / M3T-CNERTA
☆11 · Aug 10, 2022 · Updated 3 years ago
zjunlp / NLPCC2024_RegulatingLLM
[NLPCC 2024] Shared Task 10: Regulating Large Language Models
☆14 · Jun 12, 2024 · Updated last year
justincui03 / or-bench
[ICML 2025] Official repository for paper "OR-Bench: An Over-Refusal Benchmark for Large Language Models"
☆23 · Mar 4, 2025 · Updated last year