RyanLiu112/GenPRM

[AAAI 2026] Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".

☆102

Alternatives and similar repositories for GenPRM

Users who are interested in GenPRM are comparing it to the libraries listed below.

RyanLiu112 / Awesome-Process-Reward-Models
A comprehensive collection of process reward models.
☆176 · Jun 6, 2026 · Updated last month
mukhal / ThinkPRM
[TMLR] Process Reward Models That Think
☆89 · Nov 29, 2025 · Updated 7 months ago
NuoJohnChen / JudgeLRM
JudgeLRM: Large Reasoning Models as a Judge
☆42 · May 6, 2026 · Updated 2 months ago
QwenLM / ProcessBench
Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning"
☆189 · May 20, 2025 · Updated last year
PRIME-RL / PRIME
Scalable RL solution for advanced reasoning of language models
☆1,865 · Mar 18, 2025 · Updated last year
RyanLiu112 / compute-optimal-tts
Official codebase for "Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling".
☆288 · Feb 19, 2025 · Updated last year
GAIR-NLP / ToRL
☆352 · May 24, 2025 · Updated last year
Simplified-Reasoning / LUFFY
Official Repository of "Learning to Reason under Off-Policy Guidance"
☆459 · Mar 20, 2026 · Updated 4 months ago
xufangzhi / Genius
[ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework
☆72 · Jun 1, 2025 · Updated last year
ChenxinAn-fdu / POLARIS
Scaling RL on advanced reasoning models
☆691 · Oct 20, 2025 · Updated 9 months ago
ShadeCloak / ADORA
☆47 · Apr 9, 2025 · Updated last year
Open-Reasoner-Zero / Open-Reasoner-Zero
Official Repo for Open-Reasoner-Zero
☆2,096 · Jun 2, 2025 · Updated last year
RLHFlow / RLHF-Reward-Modeling
Recipes to train reward model for RLHF.
☆1,534 · Apr 24, 2025 · Updated last year
RLHFlow / Self-rewarding-reasoning-LLM
Recipes to train the self-rewarding reasoning LLMs.
☆231 · Mar 2, 2025 · Updated last year
UW-Madison-Lee-Lab / VersaPRM
☆37 · Feb 11, 2025 · Updated last year
LAMDA-NeSy / Self-Backtracking
☆52 · Feb 12, 2025 · Updated last year
shengliu66 / FractionalReason
Official github repo for "Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute"
☆17 · Jun 30, 2025 · Updated last year
lblankl / Short-RL
Short RL
☆19 · Apr 16, 2026 · Updated 3 months ago
LINs-lab / LIE
[preprint] Think Longer to Explore Deeper: Learn to Explore In-Context via Length-Incentivized Reinforcement Learning
☆19 · Feb 18, 2026 · Updated 5 months ago
ssmisya / PRMBench
[ACL' 25] The official code repository for PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models.
☆93 · Feb 15, 2025 · Updated last year
PRIME-RL / TTRL
[NeurIPS 2025] TTRL: Test-Time Reinforcement Learning
☆1,100 · Apr 15, 2026 · Updated 3 months ago
nick7nlp / FastCuRL
FastCuRL: Curriculum Reinforcement Learning with Stage-wise Context Scaling for Efficient LLM Reasoning (EMNLP 2025)
☆61 · Oct 10, 2025 · Updated 9 months ago
ByteDance-Seed / Seed-Thinking-v1.5
☆810 · Jun 9, 2025 · Updated last year
lzhxmu / CPPO
CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models (NeurIPS 2025)
☆181 · Nov 4, 2025 · Updated 8 months ago
Zhiyuan-Zeng / RLVE
[ICML 2026] RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments
☆225 · Apr 30, 2026 · Updated 2 months ago
OpenBMB / RLPR
Extrapolating RLVR to General Domains without Verifiers
☆205 · Aug 12, 2025 · Updated 11 months ago
MasterVito / SwS
Official Repo for SwS: A Weakness-driven Problem Synthesis Framework in RL for LLM Reasoning
☆42 · Nov 11, 2025 · Updated 8 months ago
Qihoo360 / Light-R1
☆765 · Dec 23, 2025 · Updated 6 months ago
IcyFish332 / T3RL
☆48 · Apr 15, 2026 · Updated 3 months ago
Interplay-LM-Reasoning / Interplay-LM-Reasoning
[ICML 2026 Spotlight] On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models
☆162 · Jun 8, 2026 · Updated last month
ypwang61 / One-Shot-RLVR
[NeurIPS 2025] Reinforcement Learning for Reasoning in Large Language Models with One Training Example
☆444 · Mar 11, 2026 · Updated 4 months ago
Gen-Verse / GenEnv
GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators
☆62 · Dec 23, 2025 · Updated 6 months ago
sail-sg / ActivePRM
☆21 · Apr 16, 2025 · Updated last year
AMAP-ML / GPG
[ICLR26] GPG: A Simple and Strong Reinforcement Learning Baseline for Model Reasoning
☆179 · Jan 29, 2026 · Updated 5 months ago
jwhj / OREO
☆116 · Jan 21, 2025 · Updated last year
THU-KEG / RM-Bench
[ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style
☆84 · Jul 18, 2025 · Updated last year
Yifan-Song793 / ETO
Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference)
☆168 · Oct 30, 2024 · Updated last year
sanjibanc / agent_prm
☆60 · Feb 19, 2025 · Updated last year
ltzheng / SimpleTIR
[ICLR 2026] End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
☆401 · Mar 30, 2026 · Updated 3 months ago