THU-KEG/LRM-FactEval

☆17

Alternatives and similar repositories for LRM-FactEval

Users who are interested in LRM-FactEval compare it to the repositories listed below.

nusnlp / FSPO
Official code for our paper "Reasoning Models Hallucinate More: Factuality-Aware Reinforcement Learning for Large Reasoning Models"
☆26 · Oct 31, 2025 · Updated 8 months ago
yizhongw / truthfulqa_reeval
☆12 · Mar 7, 2024 · Updated 2 years ago
kkk-an / UltraIF
Code of EMNLP 2025 paper 'UltraIF: Advancing Instruction Following from the Wild'.
☆21 · Apr 3, 2025 · Updated last year
armingh2000 / FactScoreLite
FactScoreLite is an implementation of the FactScore metric, designed for detailed accuracy assessment in text generation. This package bu…
☆14 · Apr 25, 2024 · Updated 2 years ago
HanNight / AdaCAD
Code for NAACL 2025 paper "AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge"
☆16 · Mar 2, 2026 · Updated 4 months ago
satrams / rent-rl
RENT (Reinforcement Learning via Entropy Minimization) is an unsupervised method for training reasoning LLMs.
☆42 · Oct 31, 2025 · Updated 8 months ago
princeton-pli / QRHead
QRHead: Query-Focused Retrieval Heads Improve Long-Context Reasoning and Re-ranking
☆40 · Jan 20, 2026 · Updated 6 months ago
VITA-Group / Ms-PoE
"Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding" Zhenyu Zhang, Runjin Chen, Shiw…
☆35 · May 7, 2024 · Updated 2 years ago
IAAR-Shanghai / FastMem
Fast Memorization of Prompt Improves Context Awareness of Large Language Models (Findings of EMNLP 2024)
☆22 · Oct 22, 2024 · Updated last year
sophicle / tokens
☆19 · May 12, 2026 · Updated 2 months ago
aimagelab / VHS
[CVPR2026 Findings] VHS: Verifier on Hidden States, an efficient inference-time scaling verification framework for DiT-based image genera…
☆16 · Mar 25, 2026 · Updated 3 months ago
javiferran / sae_entities
☆78 · Mar 6, 2025 · Updated last year
MLRM-Halu / MLRM-Halu
[NeurIPS 2025] More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models
☆82 · May 31, 2025 · Updated last year
katiekang1998 / llm_hallucinations
☆18 · May 28, 2024 · Updated 2 years ago
wizard-III / ArcherCodeR
ArcherCodeR is an open-source initiative enhancing code reasoning in large language models through scalable, rule-governed reinforcement …
☆44 · Aug 6, 2025 · Updated 11 months ago
zjunlp / KnowRL
KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality
☆48 · May 19, 2026 · Updated 2 months ago
lqiang67 / generative-models-on-toys
generative models on toys
☆12 · Sep 10, 2024 · Updated last year
technion-cs-nlp / hallucination-mitigation
☆23 · Dec 17, 2024 · Updated last year
zhumeiqiBUPT / GNN-LF-HF
WWW2021: Interpreting and Unifying Graph Neural Networks with An Optimization Framework
☆14 · Jun 23, 2021 · Updated 5 years ago
ZhangXJ199 / EDGE-GRPO
Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity
☆22 · Aug 28, 2025 · Updated 10 months ago
yunx-z / COMBO
Merging Generated and Retrieved Knowledge for Open-Domain QA (EMNLP 2023)
☆21 · Oct 8, 2023 · Updated 2 years ago
ritzz-ai / PACS
☆31 · Sep 12, 2025 · Updated 10 months ago
thu-coai / BARREL
[ICLR 2026] BARREL: Boundary-Aware Reasoning for Factual and Reliable LRMs
☆18 · May 21, 2025 · Updated last year
test-time-interaction / TTI
☆76 · Jun 10, 2025 · Updated last year
open-compass / RePro
[ICLR 2026] Rectifying LLM Thought From Lens of Optimization
☆15 · Dec 5, 2025 · Updated 7 months ago
Junjie-Ye / MulDimIF
[ACL 2026] A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models
☆23 · Jul 10, 2026 · Updated last week
zacswolf / Stockformer2022
Initially a fork of the GitHub repository for the paper "Informer" accepted by AAAI 2021. Heavily modified since then.
☆15 · Apr 7, 2023 · Updated 3 years ago
googleinterns / localizing-paragraph-memorization
☆15 · Feb 21, 2024 · Updated 2 years ago
xhwang22 / Awesome-Reward-Hacking
A curated list of papers and resources on Reward Hacking, Emergent Misalignment, and Proxy Exploitation in Large Models
☆41 · Apr 17, 2026 · Updated 3 months ago
YiCheng98 / IntegrativeDecoding
Official Implementation for the paper "Integrative Decoding: Improving Factuality via Implicit Self-consistency"
☆33 · Apr 12, 2025 · Updated last year
HypherX / Evolution-Analysis
☆25 · Dec 13, 2024 · Updated last year
LiuAmber / RAHF
[ACL 2024 main] Aligning Large Language Models with Human Preferences through Representation Engineering (https://aclanthology.org/2024.…
☆28 · Sep 25, 2024 · Updated last year
princeton-pli / DySCO
DySCO: Dynamic Attention-Scaling Decoding for Long-Context LMs
☆17 · May 30, 2026 · Updated last month
thu-coai / CDConv
Data and codes for EMNLP 2022 paper "CDConv: A Benchmark for Contradiction Detection in Chinese Conversations"
☆13 · May 8, 2023 · Updated 3 years ago
leileqiTHU / Attacker
The repo for using the model https://huggingface.co/thu-coai/Attacker-v0.1
☆13 · Apr 23, 2025 · Updated last year
controllability / jailbreak-evaluation
The jailbreak-evaluation is an easy-to-use Python package for language model jailbreak evaluation.
☆27 · Nov 4, 2024 · Updated last year
SLIT-AI / ADPA
[ICLR2025 Spotlight] Advantage-Guided Distillation for Preference Alignment in Small Language Models
☆26 · Feb 10, 2025 · Updated last year
wskbest / MFC-Bench
☆12 · Oct 17, 2024 · Updated last year
THU-KEG / VerIF
[EMNLP 2025] Verification Engineering for RL in Instruction Following
☆57 · Mar 30, 2026 · Updated 3 months ago