The Paper List on Data Contamination for Large Language Models Evaluation.
☆115 Jan 29, 2026 Updated 2 months ago
Alternatives and similar repositories for awesome-data-contamination
Users that are interested in awesome-data-contamination are comparing it to the libraries listed below.
- An open-source library for contamination detection in NLP datasets and Large Language Models (LLMs). ☆61 Aug 13, 2024 Updated last year
- ☆16 Nov 26, 2024 Updated last year
- DICE: Detecting In-distribution Data Contamination with LLM's Internal State ☆11 Sep 21, 2024 Updated last year
- The LM Contamination Index is a manually created database of contamination evidence for LMs. ☆82 Apr 11, 2024 Updated 2 years ago
- SVIP: Towards Verifiable Inference of Open-Source Large Language Models ☆15 Jun 3, 2025 Updated 10 months ago
- Latest Evaluation Toolkit (LatestEval). Assessing the language models with latest, uncontaminated materials. ☆29 Feb 17, 2025 Updated last year
- BeHonest: Benchmarking Honesty in Large Language Models ☆35 Aug 15, 2024 Updated last year
- Xlore2.0 Code [BaiduExtractor, HudongExtractor, WikiExtractor, XloreData, XloreWeb] ☆12 Apr 5, 2017 Updated 9 years ago
- Longitudinal Evaluation of LLMs via Data Compression ☆33 May 29, 2024 Updated last year
- Official code for ICLR 2024 paper, SEABO: A Simple Search-Based Method for Offline Imitation Learning ☆12 Jan 19, 2024 Updated 2 years ago
- Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples" ☆319 Dec 20, 2023 Updated 2 years ago
- This repository provides an original implementation of Detecting Pretraining Data from Large Language Models by *Weijia Shi, *Anirudh Aji… ☆244 Nov 3, 2023 Updated 2 years ago
- Code for LLM_Catastrophic_Forgetting via SAM. ☆11 Jun 7, 2024 Updated last year
- [NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering ☆79 Jan 16, 2026 Updated 3 months ago
- A framework for benchmarking embedding models in hybrid search scenarios (BM25 + vector search) using Weaviate. ☆38 Updated this week
- An implementation of Scalable Evaluation and Improvement of Document Set Expansion via Neural Positive-Unlabeled Learning without AllenNL… ☆19 Feb 20, 2024 Updated 2 years ago
- ☆19 Oct 24, 2023 Updated 2 years ago
- Paper list for the paper "Authorship Attribution in the Era of Large Language Models: Problems, Methodologies, and Challenges (SIGKDD Exp… ☆19 Apr 5, 2026 Updated 2 weeks ago
- ☆43 May 9, 2025 Updated 11 months ago
- Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from e… ☆30 May 23, 2024 Updated last year
- Code and data for NAACL 2025 paper "IHEval: Evaluating Language Models on Following the Instruction Hierarchy" ☆17 Feb 25, 2025 Updated last year
- Codes and data for EMNLP 2021 paper "Self- and Pseudo-self-supervised Prediction of Speaker and Key-utterance for Multi-party Dialogue Re… ☆16 Oct 15, 2022 Updated 3 years ago
- Code for ICLR 2022 paper Rethinking Goal-Conditioned Supervised Learning and Its Connection to Offline RL. ☆28 Feb 21, 2022 Updated 4 years ago
- A survey on harmful fine-tuning attack for large language model (ACM CSUR) ☆238 Feb 25, 2026 Updated last month
- ☆42 Nov 7, 2023 Updated 2 years ago
- Holistic Evaluation of Language Models (HELM) is an open source Python framework created by the Center for Research on Foundation Models … ☆2,754 Updated this week
- ☆10 Nov 16, 2024 Updated last year
- Cross-field empirical trends analysis of XAI literature ☆22 Sep 28, 2023 Updated 2 years ago
- Source code for the paper "CAT: Interpretable Concept-based Taylor Additive Models". ☆18 Aug 26, 2024 Updated last year
- Collection of latest papers and materials in the area of RLVR! ☆90 Updated this week
- ☆13 Oct 19, 2023 Updated 2 years ago
- [NeurIPS 2024] Goldfish Loss: Mitigating Memorization in Generative LLMs ☆98 Nov 17, 2024 Updated last year
- ☆17 Jul 6, 2023 Updated 2 years ago
- A simple agent powered by LLMs that performs tasks. ☆14 Apr 25, 2025 Updated 11 months ago
- Llemma formal2formal (tactic prediction) theorem proving experiments ☆20 Oct 17, 2023 Updated 2 years ago
- Replicating O1 inference-time scaling laws ☆93 Dec 1, 2024 Updated last year
- [ICLR 2024] DMBP: Diffusion Model-Based Predictor for Robust Offline Reinforcement Learning against State Observations Perturbations. ☆17 May 24, 2024 Updated last year
- [ICLR 2025] Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist ☆35 Oct 23, 2024 Updated last year
- The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism ☆30 Jul 17, 2024 Updated last year