The LM Contamination Index is a manually created database of contamination evidences for LMs.
☆81Apr 11, 2024Updated 2 years ago
Alternatives and similar repositories for lm-contamination
Users that are interested in lm-contamination are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- T-Projection is a method to perform high-quality Annotation Projection of Sequence Labeling datasets.☆13Nov 21, 2023Updated 2 years ago
- Do Multilingual Language Models Think Better in English?☆42Aug 3, 2023Updated 2 years ago
- An implementation of online data mixing for the Pile dataset, based on the GPT-NeoX library.☆14Jan 9, 2024Updated 2 years ago
- ☆22Dec 18, 2024Updated last year
- Official codebase accompanying our ACL 2022 paper "RELiC: Retrieving Evidence for Literary Claims" (https://relic.cs.umass.edu).☆20May 14, 2022Updated 4 years ago
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- The official repository for the paper "From Zero to Hero: Examining the Power of Symbolic Tasks in Instruction Tuning".☆65Apr 18, 2023Updated 3 years ago
- ☆12Jun 5, 2024Updated 2 years ago
- ZS4IE: A Toolkit for Zero-Shot Information Extraction with Simple Verbalizations☆29Mar 28, 2022Updated 4 years ago
- ☆11Jan 3, 2023Updated 3 years ago
- Code for paper "ProgGen: Generating Named Entity Recognition Datasets Step-by-step with Self-Reflexive Large Language Models"☆17Mar 29, 2024Updated 2 years ago
- ☆10Oct 17, 2021Updated 4 years ago