RM-R1-UIUC/RM-R1

[ICLR'26] RM-R1: Unleashing the Reasoning Potential of Reward Models

☆167

Alternatives and similar repositories for RM-R1

Users who are interested in RM-R1 are comparing it to the repositories listed below.

yfzhang114 / r1_reward
✨✨ [ICLR 2026] R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning
☆291 • May 9, 2025 • Updated last year
rubricreward / r3
R3: Robust Rubric-Agnostic Reward Models
☆23 • Jul 12, 2025 • Updated last year
nishadsinghi / sc-genrm-scaling
[COLM 2025] Official code for "When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoni…
☆15 • Oct 31, 2025 • Updated 8 months ago
RUCBM / DeepCritic
Official repository for paper "DeepCritic: Deliberate Critique with Large Language Models"
☆41 • Jun 24, 2025 • Updated last year
NuoJohnChen / JudgeLRM
JudgeLRM: Large Reasoning Models as a Judge
☆42 • May 6, 2026 • Updated 2 months ago
xiusic / DecisionFlow
☆34 • Aug 26, 2025 • Updated 10 months ago
Zhou-Zoey / RMB-Reward-Model-Benchmark
☆48 • Mar 25, 2025 • Updated last year
TIGER-AI-Lab / General-Reasoner
General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25]
☆227 • Nov 27, 2025 • Updated 7 months ago
PRIME-RL / TTRL
[NeurIPS 2025] TTRL: Test-Time Reinforcement Learning
☆1,100 • Apr 15, 2026 • Updated 3 months ago
xiusic / MinPrompt
MinPrompt: Graph-based Minimal Prompt Data Augmentation for Few-shot Question Answering
☆14 • May 3, 2024 • Updated 2 years ago
bobxwu / learning-from-rewards-llm-papers
A comprehensive collection of learning from rewards in the post-training and test-time scaling of LLMs, with a focus on both reward model…
☆73 • Jun 13, 2025 • Updated last year
bigai-nlco / RuleReasoner
[ICLR 2026] RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling
☆39 • Feb 25, 2026 • Updated 4 months ago
technion-cs-nlp / hallucination-mitigation
☆23 • Dec 17, 2024 • Updated last year
sunblaze-ucb / reasoning_ladder
☆35 • May 16, 2025 • Updated last year
facebookresearch / multimodal_rewardbench
Multimodal RewardBench
☆68 • Feb 21, 2025 • Updated last year
MasterVito / SwS
Official Repo for SwS: A Weakness-driven Problem Synthesis Framework in RL for LLM Reasoning
☆41 • Nov 11, 2025 • Updated 8 months ago
ruixin31 / Spurious_Rewards
☆361 • Jul 29, 2025 • Updated 11 months ago
inclusionAI / Ring
Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI, derived from Ling.
☆109 • Aug 5, 2025 • Updated 11 months ago
GAIR-NLP / OctoThinker
Revisiting Mid-training in the Era of Reinforcement Learning Scaling
☆189 • Jul 23, 2025 • Updated 11 months ago
ASTRAL-Group / AlphaOne
[EMNLP 2025 Main] AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
☆89 • Jun 10, 2025 • Updated last year
ypwang61 / One-Shot-RLVR
[NeurIPS 2025] Reinforcement Learning for Reasoning in Large Language Models with One Training Example
☆444 • Mar 11, 2026 • Updated 4 months ago
chenllliang / G1
G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning
☆103 • May 20, 2025 • Updated last year
ritzz-ai / PACS
☆31 • Sep 12, 2025 • Updated 10 months ago
WisdomShell / RewardAnything
RewardAnything: Generalizable Principle-Following Reward Models
☆44 • Jun 11, 2025 • Updated last year
StarDewXXX / AdaR1
The official repository of NeurIPS'25 paper "Ada-R1: From Long-CoT to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization"
☆24 • May 6, 2026 • Updated 2 months ago
THU-KEG / RM-Bench
[ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style
☆84 • Jul 18, 2025 • Updated last year
Infini-AI-Lab / GRESO
☆81 • Jun 8, 2026 • Updated last month
jinzhuoran / RAG-RewardBench
RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment
☆18 • Dec 19, 2024 • Updated last year
chentong0 / rl-binary-rar
Official repo for "Binary Retrieval-augmented Reward Mitigates Hallucinations"
☆15 • Nov 13, 2025 • Updated 8 months ago
JingMog / THOR
[ICLR-2026] Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning".
☆33 • Feb 26, 2026 • Updated 4 months ago
microsoft / x-reasoner
X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains
☆49 • Feb 4, 2026 • Updated 5 months ago
zhaochen0110 / OpenThinkIMG
OpenThinkIMG is an end-to-end open-source framework that empowers LVLMs to think with images.
☆399 • Jun 1, 2025 • Updated last year
Jun-Kai-Zhang / rubrics
The official code repo of paper "Chasing the Tail: Effective Rubric-based Reward Modeling for Large Language Model Post-Training"
☆30 • Feb 20, 2026 • Updated 5 months ago
inclusionAI / PromptCoT
A unified suite for generating elite reasoning problems and training high-performance LLMs, including pioneering attention-free architect…
☆131 • Jan 31, 2026 • Updated 5 months ago
InternLM / POLAR
Pre-trained, Scalable, High-performance Reward Models via Policy Discriminative Learning.
☆166 • Sep 23, 2025 • Updated 9 months ago
hiyouga / EasyR1
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
☆5,071 • Updated this week
InternScience / MME-Reasoning
Official Repository: A Comprehensive Benchmark for Logical Reasoning in MLLMs
☆45 • Jun 17, 2025 • Updated last year
OpenBMB / RLPR
Extrapolating RLVR to General Domains without Verifiers
☆205 • Aug 12, 2025 • Updated 11 months ago
Hanpx20 / SafeSwitch
Official code repository for the paper "Internal Activation as the Polar Star for Steering Unsafe LLM Behavior"
☆15 • May 31, 2026 • Updated last month