PrimeIntellect-ai / verifiers
Our library for RL environments + evals
☆3,730 Updated this week
Alternatives and similar repositories for verifiers
Users who are interested in verifiers compare it to the libraries listed below.
- Post-training with Tinker ☆2,699 Updated this week
- [NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards ☆1,304 Updated 3 weeks ago
- Synthetic data curation for post-training and structured data extraction ☆1,594 Updated last week
- Training Large Language Model to Reason in a Continuous Latent Space ☆1,449 Updated 5 months ago
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ☆2,233 Updated last week
- Recipes to scale inference-time compute of open models ☆1,123 Updated 7 months ago
- Textbook on reinforcement learning from human feedback ☆1,396 Updated this week
- RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments. ☆2,470 Updated this week
- SkyRL: A Modular Full-stack RL Library for LLMs ☆1,437 Updated this week
- Democratizing Reinforcement Learning for LLMs ☆4,965 Updated this week
- Async RL Training at Scale ☆985 Updated this week
- NanoGPT (124M) in 3 minutes ☆4,116 Updated this week
- AllenAI's post-training codebase ☆3,515 Updated this week
- Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse … ☆782 Updated last week
- Minimalistic 4D-parallelism distributed training framework for education purpose ☆1,939 Updated 4 months ago
- System 2 Reasoning Link Collection ☆865 Updated 9 months ago
- Implementing DeepSeek R1's GRPO algorithm from scratch ☆1,740 Updated 8 months ago
- Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard a… ☆2,029 Updated last month
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… ☆3,039 Updated 3 weeks ago
- ☆1,376 Updated 4 months ago
- A bibliography and survey of the papers surrounding o1 ☆1,216 Updated last year
- MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering ☆1,270 Updated 3 weeks ago
- Scalable RL solution for advanced reasoning of language models ☆1,794 Updated 9 months ago
- Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL ☆3,793 Updated 2 months ago
- Minimalistic large language model 3D-parallelism training ☆2,411 Updated last month
- Code for BLT research paper ☆2,024 Updated 2 months ago
- Sky-T1: Train your own O1 preview model within $450 ☆3,367 Updated 6 months ago
- slime is an LLM post-training framework for RL Scaling. ☆3,224 Updated this week
- ☆2,529 Updated this week
- ☆1,032 Updated last year