RefChecker provides an automatic checking pipeline and a benchmark dataset for detecting fine-grained hallucinations generated by Large Language Models.
☆415 · May 16, 2025 · Updated 9 months ago
Alternatives and similar repositories for RefChecker
Users who are interested in RefChecker are comparing it to the libraries listed below.
- RAGChecker: A Fine-grained Framework For Diagnosing RAG ☆1,056 · Dec 13, 2024 · Updated last year
- ☆22 · Feb 3, 2024 · Updated 2 years ago
- List of papers on hallucination detection in LLMs. ☆1,053 · Jan 11, 2026 · Updated last month
- ACL2023 - AlignScore, a metric for factual consistency evaluation. ☆152 · Mar 11, 2024 · Updated last year
- SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models ☆602 · Jun 26, 2024 · Updated last year
- GitHub repository for "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models" ☆227 · Dec 2, 2024 · Updated last year
- Fact-Checking the Output of Generative Large Language Models in both Annotation and Evaluation. ☆114 · Jan 6, 2024 · Updated 2 years ago
- GitHub repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023) ☆63 · Dec 25, 2023 · Updated 2 years ago
- ☆215 · Apr 2, 2025 · Updated 10 months ago
- Scalable Meta-Evaluation of LLMs as Evaluators ☆43 · Feb 15, 2024 · Updated 2 years ago
- A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic… ☆415 · Apr 13, 2025 · Updated 10 months ago
- This is the repository of HaluEval, a large-scale hallucination evaluation benchmark for Large Language Models. ☆554 · Feb 12, 2024 · Updated 2 years ago
- ☆76 · Feb 16, 2024 · Updated 2 years ago
- ☆11 · Jan 3, 2024 · Updated 2 years ago
- Resources for Retrieval Augmentation for Commonsense Reasoning: A Unified Approach. EMNLP 2022. ☆23 · Nov 23, 2022 · Updated 3 years ago
- Metrics to evaluate the quality of responses of your Retrieval Augmented Generation (RAG) applications. ☆324 · Jul 10, 2025 · Updated 7 months ago
- Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models" ☆136 · Jun 5, 2024 · Updated last year
- [LREC-Coling 2024] PECC: Problem Extraction and Coding Challenges ☆14 · May 30, 2024 · Updated last year
- Code and data for the FACTOR paper ☆53 · Nov 15, 2023 · Updated 2 years ago
- The repository for the survey paper <<Survey on Large Language Models Factuality: Knowledge, Retrieval and Domain-Specificity>> ☆341 · Apr 25, 2024 · Updated last year
- Supercharge Your LLM Application Evaluations 🚀 ☆12,736 · Updated this week
- We believe the ability of an LLM to attribute the text that it generates is likely to be crucial for both system developers and users in … ☆54 · Jul 28, 2023 · Updated 2 years ago
- Automated Evaluation of RAG Systems ☆693 · Mar 28, 2025 · Updated 11 months ago
- FacTool: Factuality Detection in Generative AI ☆913 · Aug 19, 2024 · Updated last year
- Benchmarking long-form factuality in large language models. Original code for our paper "Long-form factuality in large language models". ☆667 · Feb 5, 2026 · Updated 3 weeks ago
- Reading list of hallucination in LLMs. Check out our new survey paper: "Siren’s Song in the AI Ocean: A Survey on Hallucination in Large … ☆1,076 · Sep 27, 2025 · Updated 5 months ago
- Code and data accompanying the paper "TRUE: Re-evaluating Factual Consistency Evaluation". ☆84 · Feb 20, 2026 · Updated last week
- A simple library for segmenting legal texts ☆17 · Apr 22, 2023 · Updated 2 years ago
- ☆22 · Jan 13, 2025 · Updated last year
- FactCG: Enhancing Fact Checkers with Graph-Based Multi-Hop Data (NAACL 2025) ☆15 · Jul 14, 2025 · Updated 7 months ago
- Convert JSON Schemas to simple, human-readable Markdown documentation. Repo archived in favor of fork: sbrunner/jsonschema2md2 ☆27 · Jul 12, 2023 · Updated 2 years ago
- An Easy-to-use Hallucination Detection Framework for LLMs. ☆63 · Apr 21, 2024 · Updated last year
- [IJCAI 2024] FactCHD: Benchmarking Fact-Conflicting Hallucination Detection ☆90 · Apr 28, 2024 · Updated last year
- The code of “Improving Weak-to-Strong Generalization with Scalable Oversight and Ensemble Learning” ☆17 · Feb 26, 2024 · Updated 2 years ago
- Retrieval-Augmented Generation battle! ☆62 · Jul 31, 2025 · Updated 7 months ago
- LLM hallucination paper list ☆332 · Mar 11, 2024 · Updated last year
- Code for "FactKB: Generalizable Factuality Evaluation using Language Models Enhanced with Factual Knowledge". EMNLP 2023. ☆20 · Dec 25, 2023 · Updated 2 years ago
- [ICLR 2025 Oral] Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition ☆18 · Nov 25, 2024 · Updated last year
- Lightweight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and Library created by… ☆34 · Aug 24, 2024 · Updated last year