yuxiaw / Factcheck-GPT

Fact-Checking the Output of Generative Large Language Models in both Annotation and Evaluation.

☆104

Alternatives and similar repositories for Factcheck-GPT

Users who are interested in Factcheck-GPT are comparing it to the repositories listed below.

chaitanyamalaviya / ExpertQA
[Data + code] ExpertQA : Expert-Curated Questions and Attributed Answers
☆131 Updated last year
ParticleMedia / RAGTruth
GitHub repository for "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models"
☆192 Updated 8 months ago
anthonywchen / RARR
RARR: Researching and Revising What Language Models Say, Using Language Models
☆48 Updated 2 years ago
abhika-m / FAVA
☆73 Updated last year
zorazrw / filco
[Preprint] Learning to Filter Context for Retrieval-Augmented Generation
☆194 Updated last year
yuxiaw / OpenFactCheck
☆53 Updated last year
zjunlp / FactCHD
[IJCAI 2024] FactCHD: Benchmarking Fact-Conflicting Hallucination Detection
☆88 Updated last year
zhudotexe / fanoutqa
Companion code for FanOutQA: Multi-Hop, Multi-Document Question Answering for Large Language Models (ACL 2024)
☆54 Updated last month
xsc1234 / Search-in-the-Chain
Code for Search-in-the-Chain: Towards Accurate, Credible and Traceable Large Language Models for Knowledge-intensive Tasks
☆57 Updated last year
AlexTMallen / adaptive-retrieval
☆184 Updated last month
realtimeqa / realtimeqa_public
☆76 Updated last year
OSU-NLP-Group / AttrScore
Code, datasets, models for the paper "Automatic Evaluation of Attribution by Large Language Models"
☆56 Updated 2 years ago
yuh-zha / AlignScore
ACL2023 - AlignScore, a metric for factual consistency evaluation.
☆136 Updated last year
StonyBrookNLP / musique
Repository for MuSiQue: Multi-hop Questions via Single-hop Question Composition, TACL 2022
☆154 Updated last year
DaoD / INTERS
This is the repository for our paper "INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning"
☆204 Updated 7 months ago
xlang-ai / BRIGHT
BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval
☆153 Updated 2 months ago
allenai / wimbd
What's In My Big Data (WIMBD) - a toolkit for analyzing large text datasets
☆223 Updated 8 months ago
ielab / PromptReps
Prompting Large Language Models to Generate Dense and Sparse Representations for Zero-Shot Document Retrieval
☆50 Updated last month
oriyor / reasoning-on-cots
Implementation of the paper: "Answering Questions by Meta-Reasoning over Multiple Chains of Thought"
☆96 Updated last year
OSU-NLP-Group / LLM-Knowledge-Conflict
[ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts"
☆73 Updated last year
McGill-NLP / instruct-qa
Code and Data for "Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering"
☆86 Updated last year
Leezekun / Directional-Stimulus-Prompting
[NeurIPS 2023] Codebase for the paper: "Guiding Large Language Models with Directional Stimulus Prompting"
☆112 Updated 2 years ago
chentong0 / factoid-wiki
Dense X Retrieval: What Retrieval Granularity Should We Use?
☆159 Updated last year
shizhediao / R-Tuning
[NAACL 2024 Outstanding Paper] Source code for the NAACL 2024 paper entitled "R-Tuning: Instructing Large Language Models to Say 'I Don't…
☆114 Updated last year
AI21Labs / in-context-ralm
☆284 Updated last year
salesforce / factualNLG
Code for the arXiv paper: "LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond"
☆59 Updated 6 months ago
microsoft / HaDes
Token-level Reference-free Hallucination Detection
☆96 Updated 2 years ago
OSU-NLP-Group / TableLlama
[NAACL'24] Dataset, code and models for "TableLlama: Towards Open Large Generalist Models for Tables".
☆130 Updated last year
Alab-NII / 2wikimultihop
☆115 Updated last year
Hannibal046 / SelfMemory
[NeurIPS 2023] Source code for Lift Yourself Up: Retrieval-augmented Text Generation with Self Memory
☆61 Updated 2 years ago