The Paper List on Data Contamination for Large Language Models Evaluation.
☆115Jun 2, 2026Updated last month
Alternatives and similar repositories for awesome-data-contamination
Users that are interested in awesome-data-contamination are comparing it to the libraries listed below.
- An open-source library for contamination detection in NLP datasets and Large Language Models (LLMs).☆61Aug 13, 2024Updated last year
- ☆16Nov 26, 2024Updated last year
- [ACL 2025] Official code for ''Learning to Reason from Feedback at Test-Time''.☆13May 16, 2025Updated last year
- [ICCV 2025] "Fine-grained Spatiotemporal Grounding on Egocentric Videos"☆26Nov 23, 2025Updated 7 months ago
- DICE: Detecting In-distribution Data Contamination with LLM's Internal State☆12Sep 21, 2024Updated last year
- The LM Contamination Index is a manually created database of contamination evidences for LMs.☆81Apr 11, 2024Updated 2 years ago
- ☆23Dec 18, 2024Updated last year
- Latest Evaluation Toolkit (LatestEval). Assessing the language models with latest, uncontaminated materials.☆29Feb 17, 2025Updated last year
- BeHonest: Benchmarking Honesty in Large Language Models☆35Aug 15, 2024Updated last year
- [ICLR 2025] Permute-and-Flip: An optimally robust and watermarkable decoder for LLMs☆19Mar 20, 2025Updated last year
- Xlore2.0 Code[BaiduExtractor, HudongExtractor, WikiExtractor, XloreData, XloreWeb]☆12Apr 5, 2017Updated 9 years ago
- Official code for ICLR 2024 paper, SEABO: A Simple Search-Based Method for Offline Imitation Learning☆12Jan 19, 2024Updated 2 years ago
- Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples"☆324Dec 20, 2023Updated 2 years ago
- This repository provides an original implementation of Detecting Pretraining Data from Large Language Models by *Weijia Shi, *Anirudh Aji…☆243Nov 3, 2023Updated 2 years ago
- Code for LLM_Catastrophic_Forgetting via SAM.☆11Jun 7, 2024Updated 2 years ago
- [NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering☆81Jun 20, 2026Updated last week
- A framework for benchmarking embedding models in hybrid search scenarios (BM25 + vector search) using Weaviate.☆40Jun 17, 2026Updated 2 weeks ago
- ☆19Oct 24, 2023Updated 2 years ago
- Paper list for the paper "Authorship Attribution in the Era of Large Language Models: Problems, Methodologies, and Challenges (SIGKDD Exp…☆19May 25, 2026Updated last month
- Campus Help (校园帮) requirements analysis report☆10Apr 4, 2020Updated 6 years ago
- ☆55May 9, 2025Updated last year
- Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from e…☆31May 23, 2024Updated 2 years ago
- Code and data for NAACL 2025 paper "IHEval: Evaluating Language Models on Following the Instruction Hierarchy"☆17Feb 25, 2025Updated last year
- Used for thinking process intervention of reasoning models such as DeepSeek-R1, effectively controlling the reasoning thinking process. 用…☆24Apr 14, 2025Updated last year
- Codes and data for EMNLP 2021 paper "Self- and Pseudo-self-supervised Prediction of Speaker and Key-utterance for Multi-party Dialogue Re…☆16Oct 15, 2022Updated 3 years ago
- A simple unified framework for evaluating LLMs☆271Apr 14, 2025Updated last year
- A bibliography and survey of the papers surrounding o1☆1,213Nov 16, 2024Updated last year
- Official Repository for Can Language Models be Instructed to Protect Personal Information?☆13Oct 8, 2023Updated 2 years ago
- Code for ICLR 2022 paper Rethinking Goal-Conditioned Supervised Learning and Its Connection to Offline RL.☆27Feb 21, 2022Updated 4 years ago
- Official Implementation of "Open-Vocabulary Audio-Visual Semantic Segmentation" [ACM MM 2024 Oral].☆37Nov 2, 2024Updated last year
- A survey on harmful fine-tuning attack for large language model (ACM CSUR)☆246Jun 22, 2026Updated last week
- ☆42Nov 7, 2023Updated 2 years ago
- Holistic Evaluation of Language Models (HELM) is an open source Python framework created by the Center for Research on Foundation Models …☆2,840Jun 5, 2026Updated 3 weeks ago
- Cross-field empirical trends analysis of XAI literature☆22Sep 28, 2023Updated 2 years ago
- ConvBench: A Multi-Turn Conversation Evaluation Benchmark with Hierarchical Ablation Capability for Large Vision-Language Models☆16Sep 27, 2024Updated last year
- Implementation of Variational Hierarchical User-based Conversation Model☆10Jul 2, 2021Updated 5 years ago
- Code for verifying deep neural feature ansatz☆22May 3, 2023Updated 3 years ago
- Official code for "Rethinking Chain-of-Thought Reasoning for Videos"☆21Dec 14, 2025Updated 6 months ago
- Source code for the paper "CAT: Interpretable Concept-based Taylor Additive Models".☆18Aug 26, 2024Updated last year