willccbb / verifiers
Verifiers for LLM Reinforcement Learning
☆1,495 · Updated this week
Alternatives and similar repositories for verifiers
Users who are interested in verifiers compare it to the libraries listed below.
- procedural reasoning datasets ☆938 · Updated this week
- Recipes to scale inference-time compute of open models ☆1,101 · Updated last month
- Synthetic data curation for post-training and structured data extraction ☆1,434 · Updated this week
- Training Large Language Model to Reason in a Continuous Latent Space ☆1,185 · Updated 5 months ago
- ☆1,025 · Updated 6 months ago
- System 2 Reasoning Link Collection ☆843 · Updated 3 months ago
- A bibliography and survey of the papers surrounding o1 ☆1,205 · Updated 7 months ago
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ☆1,722 · Updated this week
- Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse … ☆535 · Updated this week
- RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments. ☆2,111 · Updated last week
- Automatic evals for LLMs ☆461 · Updated 2 weeks ago
- [COLM 2025] LIMO: Less is More for Reasoning ☆977 · Updated this week
- MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering ☆795 · Updated 3 weeks ago
- SkyRL: A Modular Full-stack RL Library for LLMs ☆574 · Updated this week
- Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025] ☆498 · Updated 2 months ago
- ☆585 · Updated 2 months ago
- Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement… ☆1,108 · Updated this week
- Code and Data for Tau-Bench ☆657 · Updated 5 months ago
- MLGym A New Framework and Benchmark for Advancing AI Research Agents ☆529 · Updated 2 weeks ago
- Textbook on reinforcement learning from human feedback ☆1,083 · Updated this week
- Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL ☆2,817 · Updated this week
- Understanding R1-Zero-Like Training: A Critical Perspective ☆1,023 · Updated last week
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… ☆2,800 · Updated this week
- Pretraining code for a large-scale depth-recurrent language model ☆793 · Updated last month
- An Open Source Toolkit For LLM Distillation ☆669 · Updated last month
- Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution" ☆566 · Updated 3 months ago
- A reading list on LLM based Synthetic Data Generation 🔥 ☆1,338 · Updated last month
- Democratizing Reinforcement Learning for LLMs ☆3,744 · Updated this week
- ☆2,136 · Updated this week
- Build your own visual reasoning model ☆395 · Updated this week