THU-KEG / DICE
DICE: Detecting In-distribution Data Contamination with LLM's Internal State
☆11 Updated last year
Alternatives and similar repositories for DICE
Users who are interested in DICE are comparing it to the repositories listed below.
- Codebase for Math Neurosurgery: Isolating LLMs' Math Reasoning Abilities Using Only Forward Passes ☆21 Updated 7 months ago
- The rule-based evaluation subset and code implementation of Omni-MATH ☆26 Updated last year
- Code for ProTrix: Building Models for Planning and Reasoning over Tables with Sentence Context ☆18 Updated last year
- [ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style ☆73 Updated 6 months ago
- ☆51 Updated last year
- Code for the EMNLP24 paper "A simple and effective L2 norm based method for KV Cache compression." ☆18 Updated last year
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint" ☆39 Updated 2 years ago
- ☆32 Updated 11 months ago
- Official Implementation of "DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucination" ☆27 Updated last year
- Lightweight Adapting for Black-Box Large Language Models ☆24 Updated last year
- [NeurIPS 2025@FoRLM] R1-Compress: Long Chain-of-Thought Compression via Chunk Compression and Search ☆17 Updated last week
- [EMNLP 2024] The official GitHub repo for the paper "Course-Correction: Safety Alignment Using Synthetic Preferences" ☆20 Updated last year
- This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning" ☆73 Updated 9 months ago
- ☆46 Updated 3 months ago
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ☆120 Updated 8 months ago
- RapidIn: Scalable Influence Estimation for Large Language Models (LLMs). The implementation for paper "Token-wise Influential Training Da… ☆21 Updated 8 months ago
- [ACL 2024 Findings] CriticBench: Benchmarking LLMs for Critique-Correct Reasoning ☆29 Updated last year
- BeHonest: Benchmarking Honesty in Large Language Models ☆34 Updated last year
- Data and code for the paper: Finding Safety Neurons in Large Language Models ☆20 Updated this week
- ☆25 Updated 9 months ago
- ☆18 Updated 6 months ago
- The official repository of NeurIPS'25 paper "Ada-R1: From Long-Cot to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization" ☆21 Updated 2 months ago
- ☆22 Updated last year
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment ☆16 Updated last year
- ☆12 Updated last year
- Repo for paper "CODIS: Benchmarking Context-Dependent Visual Comprehension for Multimodal Large Language Models". ☆12 Updated last year
- ☆46 Updated 10 months ago
- ☆23 Updated last year
- Resources and paper list for 'Scaling Environments for Agents'. This repository accompanies our survey on how environments contribute to … ☆57 Updated this week
- [2025-TMLR] A Survey on the Honesty of Large Language Models ☆64 Updated last year