bespokelabsai / verifiers
Verifiers for LLM Reinforcement Learning
☆55 · Updated last month
Alternatives and similar repositories for verifiers
Users interested in verifiers are comparing it to the libraries listed below.
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆57 · Updated 9 months ago
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories ☆15 · Updated 2 weeks ago
- Scalable Meta-Evaluation of LLMs as Evaluators ☆42 · Updated last year
- ☆24 · Updated 8 months ago
- ☆38 · Updated this week
- ☆64 · Updated 2 months ago
- ☆49 · Updated 6 months ago
- ☆30 · Updated last week
- Official implementation of the paper "Process Reward Model with Q-value Rankings" ☆59 · Updated 3 months ago
- Script for processing OpenAI's PRM800K process-supervision dataset into an Alpaca-style instruction-response format ☆27 · Updated last year
- Code for the EMNLP 2024 paper "Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning" ☆54 · Updated 8 months ago
- A repository for research on medium-sized language models. ☆76 · Updated last year
- ☆27 · Updated this week
- Official implementation for 'Extending LLMs’ Context Window with 100 Samples' ☆78 · Updated last year
- Simple GRPO scripts and configurations. ☆58 · Updated 3 months ago
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812) ☆31 · Updated 2 months ago
- Official Repo for InSTA: Towards Internet-Scale Training For Agents ☆42 · Updated this week
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning? ☆25 · Updated 2 months ago
- Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP 2024) ☆36 · Updated 5 months ago
- ☆50 · Updated this week
- Implementation of the paper "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆56 · Updated 5 months ago
- Code for "RATIONALYST: Pre-training Process-Supervision for Improving Reasoning" (https://arxiv.org/pdf/2410.01044) ☆33 · Updated 8 months ago
- ☆46 · Updated 2 months ago
- Systematic evaluation framework that automatically rates overthinking behavior in large language models. ☆89 · Updated 2 weeks ago
- Critique-out-Loud Reward Models ☆66 · Updated 7 months ago
- Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems ☆90 · Updated 2 months ago
- ☆34 · Updated 11 months ago
- Repo for "Z1: Efficient Test-time Scaling with Code" ☆59 · Updated last month
- ☆79 · Updated 6 months ago
- ☆41 · Updated 7 months ago