willccbb / verifiers
Verifiers for LLM Reinforcement Learning
☆1,057 Updated this week
Alternatives and similar repositories for verifiers
Users who are interested in verifiers are comparing it to the libraries listed below.
- Recipes to scale inference-time compute of open models ☆1,087 Updated last week
- ☆1,024 Updated 5 months ago
- procedural reasoning datasets ☆625 Updated this week
- RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments. ☆1,908 Updated this week
- Training Large Language Model to Reason in a Continuous Latent Space ☆1,135 Updated 4 months ago
- Synthetic data curation for post-training and structured data extraction ☆1,364 Updated this week
- A bibliography and survey of the papers surrounding o1 ☆1,193 Updated 6 months ago
- Understanding R1-Zero-Like Training: A Critical Perspective ☆956 Updated last week
- Automatic evals for LLMs ☆399 Updated this week
- ☆554 Updated last month
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ☆1,563 Updated last week
- MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering ☆728 Updated 2 weeks ago
- Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025] ☆474 Updated 3 weeks ago
- Build your own visual reasoning model ☆370 Updated this week
- ☆934 Updated 4 months ago
- System 2 Reasoning Link Collection ☆834 Updated 2 months ago
- MLGym: A New Framework and Benchmark for Advancing AI Research Agents ☆499 Updated 2 weeks ago
- LIMO: Less is More for Reasoning ☆953 Updated last month
- An Open Large Reasoning Model for Real-World Solutions ☆1,494 Updated this week
- Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution" ☆530 Updated 2 months ago
- Large Reasoning Models ☆804 Updated 5 months ago
- SkyRL-v0: Train Real-World Long-Horizon Agents via Reinforcement Learning ☆343 Updated last week
- 🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc. ☆367 Updated last week
- ReCall: Learning to Reason with Tool Call for LLMs via Reinforcement Learning ☆888 Updated 2 weeks ago
- Code and Data for Tau-Bench ☆528 Updated 4 months ago