THU-KEG/Agentic-Reward-Modeling

[ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems

☆134

Alternatives and similar repositories for Agentic-Reward-Modeling

Users interested in Agentic-Reward-Modeling are comparing it to the repositories listed below.

THU-KEG / VerIF
[EMNLP 2025] Verification Engineering for RL in Instruction Following
☆57 · Mar 30, 2026 · Updated 3 months ago
THU-KEG / Crab
[CIKM 2025] Constraint Back-translation Improves Complex Instruction Following of Large Language Models
☆18 · May 23, 2025 · Updated last year
THU-KEG / AtomR
[KDD 2025] AtomR: Atomic Operator-Empowered Large Language Models for Heterogeneous Knowledge Reasoning
☆15 · May 27, 2025 · Updated last year
THU-KEG / WildReward
Code for paper "WildReward: Learning Reward Models from In-the-Wild Human Interactions"
☆23 · Feb 26, 2026 · Updated 4 months ago
THU-KEG / Event-Level-Knowledge-Editing
☆12 · Apr 25, 2024 · Updated 2 years ago
linjh1118 / Llama3-Chinese-ORPO
A Chinese version of Llama3, obtained from Llama3 through further CPT, SFT, and ORPO
☆16 · Apr 24, 2024 · Updated 2 years ago
Aloriosa / srmt
The original Shared Recurrent Memory Transformer implementation
☆36 · Jul 11, 2025 · Updated last year
THU-KEG / ADELIE
[EMNLP2024] Aligning Large Language Models on Information Extraction
☆56 · Nov 4, 2024 · Updated last year
wzhouad / WPO
Code and models for EMNLP 2024 paper "WPO: Enhancing RLHF with Weighted Preference Optimization"
☆41 · Sep 24, 2024 · Updated last year
JiazhengZhang / AgentV-RL
☆15 · Apr 17, 2026 · Updated 3 months ago
THU-KEG / AgentIF
[NIPS 2025 DB Spotlight] AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios
☆39 · Dec 1, 2025 · Updated 7 months ago
kyle8581 / LanguageModelsasCompilers
Official implementation of Language Models as Compilers: Simulating the Execution Of Pseudocode Improves Algorithmic Reasoning in Languag…
☆23 · Apr 8, 2024 · Updated 2 years ago
THU-KEG / RM-Bench
[ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style
☆84 · Jul 18, 2025 · Updated last year
RUCAIBox / OlymMATH
The OlymMATH dataset
☆24 · Jun 1, 2025 · Updated last year
lichengliu03 / unary-feedback
☆44 · Mar 31, 2026 · Updated 3 months ago
liyongqi2002 / Email_Client
POP3 and SMTP clients implemented from raw sockets, supporting basic functions such as composing, sending, receiving, reading, and deleting email, with a simple interface (PyQt5).
☆12 · Dec 24, 2023 · Updated 2 years ago
THU-KEG / COPEN
The official code and dataset for EMNLP 2022 paper "COPEN: Probing Conceptual Knowledge in Pre-trained Language Models".
☆21 · Mar 9, 2023 · Updated 3 years ago
McGill-NLP / agent-reward-bench
AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories
☆47 · Aug 7, 2025 · Updated 11 months ago
THU-Team-Eureka / EurekAgent
EurekAgent: an autonomous research system for metric-driven tasks, built with Claude Code. Define the problem and metric. Get breakthroug…
☆73 · Updated this week
Open-Social-World / autolibra
AutoLibra: Metric Induction for Agents from Open-Ended Human Feedback
☆19 · Apr 23, 2026 · Updated 2 months ago
linjh1118 / WisdoMentor
WisdoMentor - Series: An LLM for undergraduates | 博导智言 (assisting university students' learning)
☆13 · May 9, 2024 · Updated 2 years ago
Rainier-rq / verl-if
Official implementation of the paper "Instructions are all you need: Self-supervised Reinforcement Learning for Instruction Following"
☆40 · Jan 11, 2026 · Updated 6 months ago
martenlienen / bsi
Generative Modeling with Bayesian Sample Inference
☆24 · May 17, 2025 · Updated last year
LIFEBench / LIFEBench
LIFEBENCH: Evaluating Length Instruction Following in Large Language Models
☆18 · Apr 23, 2026 · Updated 2 months ago
mandyyyyii / east
☆19 · Aug 4, 2025 · Updated 11 months ago
martin-wey / CodeUltraFeedback
CodeUltraFeedback: aligning large language models to coding preferences (TOSEM 2025)
☆76 · Jun 25, 2024 · Updated 2 years ago
yuleiqin / RAIF
A Recipe for Building LLM Reasoners to Solve Complex Instructions
☆32 · Oct 9, 2025 · Updated 9 months ago
THU-KEG / OpenSAE
☆47 · Apr 12, 2026 · Updated 3 months ago
RUCAIBox / R1-Searcher
R1-searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
☆720 · Aug 5, 2025 · Updated 11 months ago
PrimeIntellect-ai / genesys
☆138 · Mar 20, 2025 · Updated last year
limenlp / safer-instruct
This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"
☆17 · Feb 22, 2024 · Updated 2 years ago
facebookresearch / multimodal_rewardbench
Multimodal RewardBench
☆68 · Feb 21, 2025 · Updated last year
Lagooon / LeanSTaR
☆44 · Sep 19, 2024 · Updated last year
OSU-NLP-Group / Mind2Web-2
[NeurIPS'25 D&B] Mind2Web-2 Benchmark: Evaluating Agentic Search with Agent-as-a-Judge
☆111 · May 17, 2026 · Updated 2 months ago
kkk-an / UltraIF
Code of EMNLP 2025 paper 'UltraIF: Advancing Instruction Following from the Wild'.
☆21 · Apr 3, 2025 · Updated last year
RyanLiu112 / GenPRM
[AAAI 2026] Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".
☆102 · Nov 8, 2025 · Updated 8 months ago
LAMDA-NeSy / Self-Backtracking
☆52 · Feb 12, 2025 · Updated last year
TonySY2 / AgentDropoutV2
☆27 · May 27, 2026 · Updated last month
zjunlp / KnowRL
KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality
☆48 · May 19, 2026 · Updated 2 months ago