lm-sys / llm-decontaminator
Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples"
☆302 · Updated last year
Alternatives and similar repositories for llm-decontaminator
Users interested in llm-decontaminator are comparing it to the libraries listed below:
- Manage scalable open LLM inference endpoints in Slurm clusters (☆257, updated 10 months ago)
- Benchmarking LLMs with Challenging Tasks from Real Users (☆223, updated 6 months ago)
- Reproducible, flexible LLM evaluations (☆203, updated 3 weeks ago)
- ☆517, updated 6 months ago
- Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context" (☆461, updated last year)
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extremely Long Lengths (ICLR 2024) (☆202, updated last year)
- DSIR: a large-scale data selection framework for language model training (☆249, updated last year)
- A repository for research on medium-sized language models (☆495, updated 3 weeks ago)
- LOFT: A 1 Million+ Token Long-Context Benchmark (☆198, updated last month)
- A project to improve the skills of large language models (☆413, updated this week)
- Archon: a modular framework for combining different inference-time techniques and LMs with just a JSON config file (☆173, updated 2 months ago)
- RewardBench: the first evaluation tool for reward models (☆582, updated this week)
- The official evaluation suite and dynamic data release for MixEval (☆241, updated 6 months ago)
- Multipack distributed sampler for fast padding-free training of LLMs (☆188, updated 9 months ago)
- Official repository for ORPO (☆453, updated last year)
- Experiments on speculative sampling with Llama models (☆126, updated last year)
- Evaluating LLMs with fewer examples (☆156, updated last year)
- A simple unified framework for evaluating LLMs (☆215, updated last month)
- [ACL'24] Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning (☆353, updated 8 months ago)
- Scaling Data-Constrained Language Models (☆334, updated 8 months ago)
- [ICML'24] Data and code for the paper "Training-Free Long-Context Scaling of Large Language Models" (☆410, updated 7 months ago)
- 🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc. (☆367, updated last week)
- [EMNLP 2023] The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning (☆241, updated last year)
- Code and data for "Long-context LLMs Struggle with Long In-context Learning" [TMLR 2025] (☆106, updated 3 months ago)
- A project studying the performance and robustness of language models and task-adaptation methods (☆150, updated last year)
- PyTorch building blocks for the OLMo ecosystem (☆222, updated this week)
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… (☆333, updated 5 months ago)
- The HELMET Benchmark (☆148, updated last month)
- Pre-training code for the Amber 7B LLM (☆166, updated last year)
- EvolKit: a framework for automatically enhancing the complexity of instructions used for fine-tuning Large Language Models (☆220, updated 7 months ago)