bespokelabsai / verifiers
Verifiers for LLM Reinforcement Learning
☆60 Updated 2 months ago
Alternatives and similar repositories for verifiers
Users who are interested in verifiers are comparing it to the libraries listed below
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories ☆17 Updated last month
- ☆24 Updated 9 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆57 Updated 9 months ago
- Scalable Meta-Evaluation of LLMs as Evaluators ☆42 Updated last year
- ☆48 Updated last week
- Official implementation of the paper "Process Reward Model with Q-value Rankings" ☆59 Updated 4 months ago
- ☆65 Updated 2 months ago
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback. ☆27 Updated 2 months ago
- Aioli: A unified optimization framework for language model data mixing ☆27 Updated 5 months ago
- ☆32 Updated last month
- Code for the paper: Harnessing Webpage UIs for Text-Rich Visual Understanding ☆51 Updated 6 months ago
- ☆51 Updated 7 months ago
- [ICML 2025] Predictive Data Selection: The Data That Predicts Is the Data That Teaches ☆47 Updated 3 months ago
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning? ☆25 Updated 3 months ago
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data" ☆48 Updated last year
- ☆68 Updated 3 months ago
- ☆115 Updated 4 months ago
- Scaling RL on advanced reasoning models ☆100 Updated this week
- ☆27 Updated 2 weeks ago
- ☆36 Updated last week
- ☆47 Updated 4 months ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ☆32 Updated 3 months ago
- Official Repo for InSTA: Towards Internet-Scale Training For Agents ☆42 Updated this week
- Official repository for "BLEUBERI: BLEU is a surprisingly effective reward for instruction following" ☆20 Updated 2 weeks ago
- DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents ☆82 Updated this week
- Maya: An Instruction Finetuned Multilingual Multimodal Model using Aya ☆112 Updated last month
- ☆47 Updated 3 weeks ago
- Learning to Retrieve by Trying - Source code for Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval ☆39 Updated 7 months ago
- Critique-out-Loud Reward Models ☆66 Updated 8 months ago
- Simple repository for training small reasoning models ☆31 Updated 4 months ago