google-deepmind/long-form-factuality

Benchmarking long-form factuality in large language models. Original code for our paper "Long-form factuality in large language models".

☆692

Alternatives and similar repositories for long-form-factuality

Users that are interested in long-form-factuality are comparing it to the libraries listed below.

shmsw25 / FActScore
A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic…
☆448 · Apr 13, 2025 · Updated last year
yuxiaw / Factcheck-GPT
Fact-Checking the Output of Generative Large Language Models in both Annotation and Evaluation.
☆116 · Jan 6, 2024 · Updated 2 years ago
abhika-m / FAVA
☆77 · Feb 16, 2024 · Updated 2 years ago
GAIR-NLP / factool
FacTool: Factuality Detection in Generative AI
☆933 · Aug 19, 2024 · Updated last year
Miaoranmmm / SelfChecker
codes for "Self-Checker: Plug-and-Play Modules for Fact-Checking with Large Language Models"
☆12 · Feb 10, 2025 · Updated last year
RUCAIBox / HaluEval-2.0
☆50 · Jan 7, 2024 · Updated 2 years ago
microsoft / FILM
Official repo for "Make Your LLM Fully Utilize the Context"
☆275 · May 15, 2024 · Updated 2 years ago
chaitanyamalaviya / ExpertQA
[Data + code] ExpertQA : Expert-Curated Questions and Attributed Answers
☆139 · Mar 14, 2024 · Updated 2 years ago
openai / simple-evals
☆4,575 · Apr 22, 2026 · Updated 2 months ago
google-deepmind / loft
LOFT: A 1 Million+ Token Long-Context Benchmark
☆237 · Apr 13, 2026 · Updated 3 months ago
DaoD / INTERS
This is the repository for our paper "INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning"
☆208 · Feb 18, 2026 · Updated 5 months ago
hkust-nlp / felm
Github repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023)
☆65 · Dec 25, 2023 · Updated 2 years ago
YiCheng98 / IntegrativeDecoding
Official Implementation for the paper "Integrative Decoding: Improving Factuality via Implicit Self-consistency"
☆33 · Apr 12, 2025 · Updated last year
XuezheMax / megalodon
Reference implementation of Megalodon 7B model
☆526 · May 17, 2025 · Updated last year
Yixiao-Song / VeriScore
☆39 · Dec 17, 2025 · Updated 7 months ago
allenai / lumos
Code and data for "Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs"
☆478 · Mar 19, 2024 · Updated 2 years ago
xfactlab / orpo
Official repository for ORPO
☆480 · May 31, 2024 · Updated 2 years ago
ryokamoi / wice
This repository contains the dataset and code for "WiCE: Real-World Entailment for Claims in Wikipedia" in EMNLP 2023.
☆42 · Dec 15, 2023 · Updated 2 years ago
katiekang1998 / llm_hallucinations
☆18 · May 28, 2024 · Updated 2 years ago
du-nlp-lab / MLR-Copilot
☆70 · Mar 30, 2025 · Updated last year
potsawee / selfcheckgpt
SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models
☆627 · Jun 26, 2024 · Updated 2 years ago
XueFuzhao / OpenMoE
A family of open-sourced Mixture-of-Experts (MoE) Large Language Models
☆1,691 · Mar 8, 2024 · Updated 2 years ago
microsoft / rho
Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs.
☆470 · Apr 18, 2024 · Updated 2 years ago
nayeon7lee / FactualityPrompt
☆90 · Nov 11, 2022 · Updated 3 years ago
armingh2000 / FactScoreLite
FactScoreLite is an implementation of the FactScore metric, designed for detailed accuracy assessment in text generation. This package bu…
☆14 · Apr 25, 2024 · Updated 2 years ago
princeton-nlp / SimPO
[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward
☆956 · Feb 16, 2025 · Updated last year
datamllab / LongLM
[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
☆668 · Jun 1, 2024 · Updated 2 years ago
FranxYao / Long-Context-Data-Engineering
Implementation of paper Data Engineering for Scaling Language Models to 128K Context
☆501 · Mar 19, 2024 · Updated 2 years ago
Re-Align / URIAL
☆316 · Jun 9, 2024 · Updated 2 years ago
OpenMOSS / HalluQA
Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models"
☆139 · Jun 5, 2024 · Updated 2 years ago
zjunlp / FactCHD
[IJCAI 2024] FactCHD: Benchmarking Fact-Conflicting Hallucination Detection
☆90 · Apr 28, 2024 · Updated 2 years ago
OpenBMB / Eurus
☆322 · Sep 18, 2024 · Updated last year
microsoft / ConstrainedReasoner
☆13 · Aug 26, 2024 · Updated last year
EdinburghNLP / awesome-hallucination-detection
List of papers on hallucination detection in LLMs.
☆1,120 · Jun 6, 2026 · Updated last month
huggingface / datatrove
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
☆3,214 · Updated this week
TIGER-AI-Lab / MAmmoTH2
Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024]
☆146 · Oct 27, 2024 · Updated last year
arcee-ai / mergekit
Tools for merging pretrained large language models.
☆7,246 · Jun 17, 2026 · Updated last month
VITA-Group / Q-GaLore
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.
☆206 · Jul 17, 2024 · Updated 2 years ago
zjunlp / EasyEdit
[ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.
☆2,878 · Jul 14, 2026 · Updated last week