shmsw25/FActScore

A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"

☆450

Alternatives and similar repositories for FActScore

Users who are interested in FActScore are comparing it to the repositories listed below.

yuxiaw / Factcheck-GPT
Fact-Checking the Output of Generative Large Language Models in both Annotation and Evaluation.
☆116 · Jan 6, 2024 · Updated 2 years ago
RUCAIBox / HaluEval
This is the repository of HaluEval, a large-scale hallucination evaluation benchmark for Large Language Models.
☆592 · Feb 12, 2024 · Updated 2 years ago
kttian / llm_factuality_tuning
☆41 · May 2, 2024 · Updated 2 years ago
google-deepmind / long-form-factuality
Benchmarking long-form factuality in large language models. Original code for our paper "Long-form factuality in large language models".
☆692 · Jun 18, 2026 · Updated last month
nayeon7lee / FactualityPrompt
☆90 · Nov 11, 2022 · Updated 3 years ago
katiekang1998 / llm_hallucinations
☆18 · May 28, 2024 · Updated 2 years ago
HillZhang1999 / llm-hallucination-survey
Reading list of hallucination in LLMs. Check out our new survey paper: "Siren’s Song in the AI Ocean: A Survey on Hallucination in Large …
☆1,085 · Sep 27, 2025 · Updated 9 months ago
hkust-nlp / felm
GitHub repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023)
☆65 · Dec 25, 2023 · Updated 2 years ago
yuh-zha / AlignScore
ACL2023 - AlignScore, a metric for factual consistency evaluation.
☆164 · Mar 11, 2024 · Updated 2 years ago
potsawee / selfcheckgpt
SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models
☆628 · Jun 26, 2024 · Updated 2 years ago
tingofurro / summac
Codebase, data and models for the SummaC paper in TACL
☆110 · Jan 30, 2025 · Updated last year
princeton-nlp / ALCE
[EMNLP 2023] Enabling Large Language Models to Generate Text with Citations. Paper: https://arxiv.org/abs/2305.14627
☆522 · Oct 9, 2024 · Updated last year
ryokamoi / wice
This repository contains the dataset and code for "WiCE: Real-World Entailment for Claims in Wikipedia" in EMNLP 2023.
☆43 · Dec 15, 2023 · Updated 2 years ago
Yale-LILY / ROSE
☆41 · Jun 7, 2023 · Updated 3 years ago
chaitanyamalaviya / ExpertQA
[Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers
☆139 · Mar 14, 2024 · Updated 2 years ago
allenai / FineGrainedRLHF
☆283 · Jan 6, 2025 · Updated last year
AlexTMallen / adaptive-retrieval
☆192 · Jul 2, 2025 · Updated last year
velocityCavalry / CREPE
An original implementation of the paper "CREPE: Open-Domain Question Answering with False Presuppositions"
☆16 · Nov 5, 2024 · Updated last year
GAIR-NLP / factool
FacTool: Factuality Detection in Generative AI
☆933 · Aug 19, 2024 · Updated last year
voidism / DoLa
Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"
☆557 · Jul 12, 2026 · Updated last week
sylinrl / TruthfulQA
TruthfulQA: Measuring How Models Imitate Human Falsehoods
☆934 · Jan 16, 2025 · Updated last year
salesforce / factCC
Resources for the "Evaluating the Factual Consistency of Abstractive Text Summarization" paper
☆305 · May 1, 2025 · Updated last year
Yixiao-Song / VeriScore
☆39 · Dec 17, 2025 · Updated 7 months ago
RUCAIBox / HaluEval-2.0
☆50 · Jan 7, 2024 · Updated 2 years ago
EdinburghNLP / awesome-hallucination-detection
List of papers on hallucination detection in LLMs.
☆1,120 · Jun 6, 2026 · Updated last month
nelson-liu / evaluating-verifiability-in-generative-search-engines
Companion repo for "Evaluating Verifiability in Generative Search Engines".
☆87 · May 12, 2023 · Updated 3 years ago
yinzhangyue / SelfAware
Do Large Language Models Know What They Don’t Know?
☆103 · Nov 8, 2024 · Updated last year
likenneth / honest_llama
Inference-Time Intervention: Eliciting Truthful Answers from a Language Model
☆581 · Jan 28, 2025 · Updated last year
facebookresearch / NPM
The original implementation of Min et al. "Nonparametric Masked Language Modeling" (paper: https://arxiv.org/abs/2212.01349)
☆159 · Jan 6, 2023 · Updated 3 years ago
armingh2000 / FactScoreLite
FactScoreLite is an implementation of the FactScore metric, designed for detailed accuracy assessment in text generation. This package bu…
☆14 · Apr 25, 2024 · Updated 2 years ago
wangcunxiang / LLM-Factuality-Survey
The repository for the survey paper "Survey on Large Language Models Factuality: Knowledge, Retrieval and Domain-Specificity"
☆339 · Mar 28, 2026 · Updated 3 months ago
jinlanfu / GPTScore
Source Code of Paper "GPTScore: Evaluate as You Desire"
☆258 · Feb 21, 2023 · Updated 3 years ago
microsoft / HaDes
Token-level Reference-free Hallucination Detection
☆97 · Jul 25, 2023 · Updated 2 years ago
google-research / true
Code and data accompanying the paper "TRUE: Re-evaluating Factual Consistency Evaluation".
☆92 · Jun 16, 2026 · Updated last month
AkariAsai / self-rag
This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai,…
☆2,410 · May 25, 2024 · Updated 2 years ago
violet-zct / fairseq-detect-hallucination
Detect hallucinated tokens for conditional sequence generation.
☆64 · Apr 15, 2022 · Updated 4 years ago
balevinstein / Probes
☆58 · Jun 30, 2023 · Updated 3 years ago
open-compass / ANAH
[ACL 2024] ANAH & [NeurIPS 2024] ANAH-v2 & [ICLR 2025] Mask-DPO
☆66 · Apr 30, 2025 · Updated last year
neulab / BARTScore
BARTScore: Evaluating Generated Text as Text Generation
☆368 · Jun 27, 2022 · Updated 4 years ago