mukhal/ThinkPRM

[TMLR] Process Reward Models That Think

☆89

Alternatives and similar repositories for ThinkPRM

Users interested in ThinkPRM are comparing it to the repositories listed below.

RyanLiu112 / Awesome-Process-Reward-Models
A comprehensive collection of process reward models.
☆176 · Jun 6, 2026 · Updated last month
sail-sg / ActivePRM
☆21 · Apr 16, 2025 · Updated last year
test-time-interaction / TTI
☆76 · Jun 10, 2025 · Updated last year
bbartoldson / TBA
Official implementation of TBA for async LLM post-training.
☆31 · Nov 5, 2025 · Updated 8 months ago
TIGER-AI-Lab / General-Reasoner
General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25]
☆227 · Nov 27, 2025 · Updated 7 months ago
sanjibanc / agent_prm
☆60 · Feb 19, 2025 · Updated last year
RyanLiu112 / GenPRM
[AAAI 2026] Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".
☆102 · Nov 8, 2025 · Updated 8 months ago
chentong0 / rl-binary-rar
Official repo for "Binary Retrieval-augmented Reward Mitigates Hallucinations"
☆15 · Nov 13, 2025 · Updated 8 months ago
Kwai-Klear / RLEP
RL with Experience Replay
☆59 · Jul 27, 2025 · Updated 11 months ago
ruixin31 / Spurious_Rewards
☆361 · Jul 29, 2025 · Updated 11 months ago
sdiehl / prm
Library for training process reward models
☆29 · Jun 3, 2025 · Updated last year
nick7nlp / FastCuRL
FastCuRL: Curriculum Reinforcement Learning with Stage-wise Context Scaling for Efficient LLM Reasoning (EMNLP 2025)
☆61 · Oct 10, 2025 · Updated 9 months ago
sail-sg / variational-reasoning
Code for "Variational Reasoning for Language Models"
☆60 · Sep 29, 2025 · Updated 9 months ago
Zhiyuan-Zeng / RLVE
[ICML 2026] RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments
☆223 · Apr 30, 2026 · Updated 2 months ago
rosieyzh / openrlhf-pretrain
Code for "Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining"
☆29 · Oct 14, 2025 · Updated 9 months ago
xinzhel / LLM-Search
Survey on LLM Inference via Search (TMLR 2025)
☆15 · May 6, 2025 · Updated last year
Interplay-LM-Reasoning / Interplay-LM-Reasoning
[ICML 2026 Spotlight] On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models
☆162 · Jun 8, 2026 · Updated last month
complex-reasoning / RPG
[ICLR 2026] RPG: KL-Regularized Policy Gradient (https://arxiv.org/abs/2505.17508)
☆76 · Jun 29, 2026 · Updated 3 weeks ago
portal-cornell / muCode
☆33 · Oct 2, 2025 · Updated 9 months ago
knoveleng / open-rs
[AAAI 2026] Official repo for paper: "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't"
☆291 · Mar 11, 2026 · Updated 4 months ago
PRIME-RL / TTRL
[NeurIPS 2025] TTRL: Test-Time Reinforcement Learning
☆1,100 · Apr 15, 2026 · Updated 3 months ago
tedmoskovitz / ConstrainedRL4LMs
A library for constrained RLHF.
☆13 · Feb 19, 2024 · Updated 2 years ago
hbin0701 / Self-Explore
[EMNLP Findings 2024 & ACL 2024 NLRSE Oral] Enhancing Mathematical Reasonin…
☆52 · May 4, 2024 · Updated 2 years ago
CJReinforce / PURE
Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning"
☆172 · Oct 23, 2025 · Updated 8 months ago
QwenLM / ProcessBench
Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning"
☆189 · May 20, 2025 · Updated last year
ypwang61 / One-Shot-RLVR
[NeurIPS 2025] Reinforcement Learning for Reasoning in Large Language Models with One Training Example
☆444 · Mar 11, 2026 · Updated 4 months ago
lime-RL / DCPO
DCPO: Dynamic Adaptive Clipping for RL
☆49 · Apr 1, 2026 · Updated 3 months ago
NVlabs / RLP
[ICLR 2026] Official PyTorch Implementation of RLP: Reinforcement as a Pretraining Objective
☆252 · Jan 26, 2026 · Updated 5 months ago
tinnerhrhe / ROVER
An official implementation of Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards
☆36 · Oct 3, 2025 · Updated 9 months ago
Jiuzhouh / Uncertainty-Aware-Language-Agent
This is the official repo for Towards Uncertainty-Aware Language Agent.
☆31 · Aug 15, 2024 · Updated last year
rawsh / mirrorllm
various experiments for scaling inference time compute with small reasoning models
☆17 · Jan 16, 2025 · Updated last year
RLsys-Foundation / APRIL
APRIL: Active Partial Rollouts in Reinforcement Learning to Tame Long-tail Generation. A system-level optimization for scalable LLM tra…
☆60 · Oct 11, 2025 · Updated 9 months ago
debjitpaul / Causal_CoT
The corresponding code from our paper "Making Reasoning Matter: Measuring and Improving Faithfulness of Chain-of-Thought Reasoning…
☆13 · Jan 14, 2026 · Updated 6 months ago
violetxi / ExpRL
☆19 · Jun 16, 2026 · Updated last month
thu-wyz / inference_scaling
☆80 · Nov 19, 2024 · Updated last year
SCIR-SC-Qiaoban-Team / FreeEvalLM
[AAAI26] Trade-offs in Large Reasoning Models: An Empirical Analysis of Deliberative and Adaptive Reasoning over Foundational Capabilitie…
☆11 · Feb 7, 2026 · Updated 5 months ago
UCSB-NLP-Chang / ThinkPrune
☆46 · Sep 27, 2025 · Updated 9 months ago
zitian-gao / SC-MCTS
Interpretable Contrastive Monte Carlo Tree Search Reasoning
☆52 · Nov 9, 2024 · Updated last year
xiye17 / TextualExplInContext
The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning (NeurIPS 2022)
☆16 · Feb 11, 2023 · Updated 3 years ago