The Paper List on Data Contamination for Large Language Models Evaluation.
☆112Jan 29, 2026Updated 2 months ago
Alternatives and similar repositories for awesome-data-contamination
Users that are interested in awesome-data-contamination are comparing it to the libraries listed below.
- An open-source library for contamination detection in NLP datasets and Large Language Models (LLMs).☆60Aug 13, 2024Updated last year
- ☆16Nov 26, 2024Updated last year
- DICE: Detecting In-distribution Data Contamination with LLM's Internal State☆11Sep 21, 2024Updated last year
- The LM Contamination Index is a manually created database of contamination evidences for LMs.☆82Apr 11, 2024Updated last year
- ☆23Dec 18, 2024Updated last year
- SVIP: Towards Verifiable Inference of Open-Source Large Language Models☆14Jun 3, 2025Updated 9 months ago
- Latest Evaluation Toolkit (LatestEval). Assessing the language models with latest, uncontaminated materials.☆29Feb 17, 2025Updated last year
- BeHonest: Benchmarking Honesty in Large Language Models☆34Aug 15, 2024Updated last year
- [ICLR 2025] Permute-and-Flip: An optimally robust and watermarkable decoder for LLMs☆19Mar 20, 2025Updated last year
- Longitudinal Evaluation of LLMs via Data Compression☆33May 29, 2024Updated last year
- This repository provides an original implementation of Detecting Pretraining Data from Large Language Models by *Weijia Shi, *Anirudh Aji…☆241Nov 3, 2023Updated 2 years ago
- Code for LLM_Catastrophic_Forgetting via SAM.☆11Jun 7, 2024Updated last year
- [NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering☆78Jan 16, 2026Updated 2 months ago
- An implementation of Scalable Evaluation and Improvement of Document Set Expansion via Neural Positive-Unlabeled Learning without AllenNL…☆19Feb 20, 2024Updated 2 years ago
- ☆19Oct 24, 2023Updated 2 years ago
- Paper list for the paper "Authorship Attribution in the Era of Large Language Models: Problems, Methodologies, and Challenges (SIGKDD Exp…☆18Mar 17, 2026Updated last week
- Requirements analysis report for 校园帮 (Campus Help)☆10Apr 4, 2020Updated 5 years ago
- Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from e…☆29May 23, 2024Updated last year
- ☆36Mar 10, 2025Updated last year
- An automated feature engineering framework 'FETCH' accepted in ICLR 2023.☆11Jun 20, 2023Updated 2 years ago
- Code and data for NAACL 2025 paper "IHEval: Evaluating Language Models on Following the Instruction Hierarchy"☆16Feb 25, 2025Updated last year
- Codes and data for EMNLP 2021 paper "Self- and Pseudo-self-supervised Prediction of Speaker and Key-utterance for Multi-party Dialogue Re…☆16Oct 15, 2022Updated 3 years ago
- A simple unified framework for evaluating LLMs☆267Apr 14, 2025Updated 11 months ago
- A bibliography and survey of the papers surrounding o1☆1,213Nov 16, 2024Updated last year
- Official Repository for Can Language Models be Instructed to Protect Personal Information?☆13Oct 8, 2023Updated 2 years ago
- Code for ICLR 2022 paper Rethinking Goal-Conditioned Supervised Learning and Its Connection to Offline RL.☆28Feb 21, 2022Updated 4 years ago
- Replication package of the ICSE2025 paper titled "Leveraging Large Language Models for Enhancing the Understandability of Generated Unit …☆11Feb 19, 2025Updated last year
- ☆13Aug 7, 2025Updated 7 months ago
- ☆42Nov 7, 2023Updated 2 years ago
- Holistic Evaluation of Language Models (HELM) is an open source Python framework created by the Center for Research on Foundation Models …☆2,722Updated this week
- ☆10Nov 16, 2024Updated last year
- Less Is More: Training-Free Sparse Attention with Global Locality for Efficient Reasoning☆29Sep 12, 2025Updated 6 months ago
- ConvBench: A Multi-Turn Conversation Evaluation Benchmark with Hierarchical Ablation Capability for Large Vision-Language Models☆16Sep 27, 2024Updated last year
- Cross-field empirical trends analysis of XAI literature☆22Sep 28, 2023Updated 2 years ago
- Implementation of Variational Hierarchical User-based Conversation Model☆10Jul 2, 2021Updated 4 years ago
- Code for verifying deep neural feature ansatz☆22May 3, 2023Updated 2 years ago
- Source code for the paper "CAT: Interpretable Concept-based Taylor Additive Models".☆18Aug 26, 2024Updated last year
- Paper list for the survey "Combating Misinformation in the Age of LLMs: Opportunities and Challenges" and the initiative "LLMs Meet Misin…☆106Nov 9, 2024Updated last year
- ☆13Oct 19, 2023Updated 2 years ago