Codebase for LLM Textual Hallucination Benchmark
☆74Apr 25, 2025Updated 10 months ago
Alternatives and similar repositories for HalluLens
Users that are interested in HalluLens are comparing it to the libraries listed below
- ☆16Jun 25, 2025Updated 8 months ago
- ☆33Dec 17, 2025Updated 2 months ago
- FactScoreLite is an implementation of the FactScore metric, designed for detailed accuracy assessment in text generation. This package bu…☆13Apr 25, 2024Updated last year
- [ICLR 2026] BARREL: Boundary-Aware Reasoning for Factual and Reliable LRMs☆17May 21, 2025Updated 9 months ago
- ☆61Nov 10, 2025Updated 3 months ago
- Source code for paper "On the Pareto Front of Multilingual Neural Machine Translation" @ NeurIPS 2023☆16Sep 27, 2023Updated 2 years ago
- Code for NAACL 2025 paper "AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge"☆16Updated this week
- FactCG: Enhancing Fact Checkers with Graph-Based Multi-Hop Data (NAACL 2025)☆15Jul 14, 2025Updated 7 months ago
- BigKnow2022: Bringing Language Models Up to Speed☆16Mar 27, 2023Updated 2 years ago
- FrugalScore is an approach to learn a fixed, low cost version of any expensive NLG metric, while retaining most of its original performan…☆16Sep 21, 2022Updated 3 years ago
- ☆20Apr 10, 2025Updated 10 months ago
- ☆17May 28, 2024Updated last year
- ☆43Aug 23, 2023Updated 2 years ago
- Code for our CCL 2021 paper: Incorporating Commonsense Knowledge into Abstractive Dialogue Summarization via Heterogeneous Graph Network…☆26Jul 28, 2021Updated 4 years ago
- Code repo for our paper "Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis" (https://arxiv.org/abs/2406.10794…☆23Jul 26, 2024Updated last year
- RENT (Reinforcement Learning via Entropy Minimization) is an unsupervised method for training reasoning LLMs.☆41Oct 31, 2025Updated 4 months ago
- Merging Generated and Retrieved Knowledge for Open-Domain QA (EMNLP 2023)☆22Oct 8, 2023Updated 2 years ago
- GitHub repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023)☆63Dec 25, 2023Updated 2 years ago
- Attack AlphaZero Go agents (NeurIPS 2022)☆22Dec 3, 2022Updated 3 years ago
- Code for paper Towards Mitigating LLM Hallucination via Self Reflection☆30Oct 9, 2023Updated 2 years ago
- ☆23Nov 20, 2021Updated 4 years ago
- RewardAnything: Generalizable Principle-Following Reward Models☆45Jun 11, 2025Updated 8 months ago
- [NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering☆76Jan 16, 2026Updated last month
- [NeurIPS 2024] Accelerating Greedy Coordinate Gradient and General Prompt Optimization via Probe Sampling☆34Nov 8, 2024Updated last year
- The repository for ACL 2024 paper "TimeBench: A Comprehensive Evaluation of Temporal Reasoning Abilities in Large Language Models"☆34Jun 29, 2024Updated last year
- Code for ICML 2024 paper☆35Sep 18, 2025Updated 5 months ago
- ☆17Sep 10, 2025Updated 5 months ago
- Code for "APTBench: Benchmarking Agentic Potential of Base LLMs During Pre-Training"☆38Dec 23, 2025Updated 2 months ago
- ☆12May 6, 2022Updated 3 years ago
- A holistic benchmark for LLM abstention☆71Aug 27, 2025Updated 6 months ago
- Token-level Reference-free Hallucination Detection☆97Jul 25, 2023Updated 2 years ago
- ☆52Jul 31, 2024Updated last year
- 3D classification and object segmentation on LiDAR data with Deep Learning (CCIA2022)☆12Dec 15, 2023Updated 2 years ago
- PyTorch version of Continuous Language Generative Flow (ACL 2021)☆11Sep 14, 2021Updated 4 years ago
- Paddle reproduction of the paper ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information (ACL 2021)☆10Nov 15, 2021Updated 4 years ago
- Embodied-Planner-R1: Unleashing Embodied Task Planning Ability in LLMs via Reinforcement Learning☆25Jan 5, 2026Updated 2 months ago
- GRPO training for long-form QA and instruction following with a long-form reward model☆17Jul 17, 2025Updated 7 months ago
- Implementation of MetaVQA.☆12Jul 3, 2021Updated 4 years ago
- ☆45Jan 21, 2025Updated last year