mlfoundations / evalchemy
Automatic evals for LLMs
☆334Updated this week
Alternatives and similar repositories for evalchemy:
Users who are interested in evalchemy are comparing it to the libraries listed below:
- ☆501Updated 4 months ago
- Official repository for ORPO☆444Updated 9 months ago
- [ACL'24] Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning☆349Updated 6 months ago
- A simple unified framework for evaluating LLMs☆204Updated 2 weeks ago
- RewardBench: the first evaluation tool for reward models.☆526Updated 3 weeks ago
- The official evaluation suite and dynamic data release for MixEval.☆233Updated 4 months ago
- Benchmarking LLMs with Challenging Tasks from Real Users☆218Updated 4 months ago
- [ICLR 2025] Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing. Your efficient and high-quality synthetic data …☆659Updated this week
- awesome synthetic (text) datasets☆264Updated 4 months ago
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends☆1,313Updated this week
- Reproducible, flexible LLM evaluations☆176Updated 3 months ago
- Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym☆402Updated last week
- FuseAI Project☆547Updated last month
- Implementation of paper Data Engineering for Scaling Language Models to 128K Context☆453Updated last year
- Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.☆705Updated 5 months ago
- PyTorch building blocks for the OLMo ecosystem☆165Updated this week
- Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs.☆405Updated 11 months ago
- Generative Representational Instruction Tuning☆610Updated last week
- An Open Source Toolkit For LLM Distillation☆540Updated 2 months ago
- 🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc.☆224Updated last week
- Code for Quiet-STaR☆721Updated 7 months ago
- ☆307Updated 9 months ago
- OLMoE: Open Mixture-of-Experts Language Models☆690Updated last week
- A library for easily merging multiple LLM experts and efficiently training the merged LLM.☆454Updated 6 months ago
- ☆263Updated 7 months ago
- Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore".☆194Updated this week
- Official Repo for "Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale"☆229Updated last month
- ☆160Updated 7 months ago