THU-KEG / DICE
DICE: Detecting In-distribution Data Contamination with LLM's Internal State
☆11 Updated last year
Alternatives and similar repositories for DICE
Users who are interested in DICE are comparing it to the repositories listed below
- Codebase for Math Neurosurgery: Isolating LLMs' Math Reasoning Abilities Using Only Forward Passes ☆20 Updated 6 months ago
- ☆12 Updated last year
- Code for ProTrix: Building Models for Planning and Reasoning over Tables with Sentence Context ☆18 Updated last year
- RapidIn: Scalable Influence Estimation for Large Language Models (LLMs). The implementation for paper "Token-wise Influential Training Da… ☆21 Updated 7 months ago
- Code for the EMNLP24 paper "A simple and effective L2 norm based method for KV Cache compression." ☆17 Updated last year
- ☆31 Updated 10 months ago
- Data and code for the paper: Finding Safety Neurons in Large Language Models ☆17 Updated last year
- The rule-based evaluation subset and code implementation of Omni-MATH ☆25 Updated 11 months ago
- ☆29 Updated last year
- Code for paper: Long cOntext aliGnment via efficient preference Optimization ☆23 Updated 2 months ago
- This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning" ☆72 Updated 7 months ago
- Official Implementation of "DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucination" ☆27 Updated last year
- Lightweight Adapting for Black-Box Large Language Models ☆24 Updated last year
- ☆18 Updated 4 months ago
- Official Implementation of UA^{2}-Agent and other baseline algorithms of "Towards Unified Alignment Between Agents, Humans, and Environme… ☆19 Updated last year
- ☆23 Updated last year
- Official Repository for paper "Ontology-Free General-Domain Knowledge Graph-to-Text Generation Dataset Synthesis using Large Language Mod… ☆13 Updated last year
- Repo for paper "CODIS: Benchmarking Context-Dependent Visual Comprehension for Multimodal Large Language Models". ☆12 Updated last year
- Public code repo for COLING 2025 paper "Aligning LLMs with Individual Preferences via Interaction" ☆40 Updated 8 months ago
- BeHonest: Benchmarking Honesty in Large Language Models ☆34 Updated last year
- [NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering ☆67 Updated last year
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment ☆16 Updated last year
- ☆29 Updated last year
- [ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style ☆72 Updated 5 months ago
- [NeurIPS 2024] HonestLLM: Toward an Honest and Helpful Large Language Model ☆29 Updated 6 months ago
- Evaluate the Quality of Critique ☆36 Updated last year
- [2025-TMLR] A Survey on the Honesty of Large Language Models ☆63 Updated last year
- ☆51 Updated last year
- Synthesizing realistic and diverse text-datasets from augmented LLMs ☆16 Updated 8 months ago
- ☆24 Updated 8 months ago