A comprehensive collection of process reward models.
☆138Oct 4, 2025Updated 5 months ago
Alternatives and similar repositories for Awesome-Process-Reward-Models
Users who are interested in Awesome-Process-Reward-Models are comparing it to the repositories listed below
- [AAAI 2026] Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".☆94Nov 8, 2025Updated 3 months ago
- Process Reward Models That Think☆80Nov 29, 2025Updated 3 months ago
- This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"☆17Feb 22, 2024Updated 2 years ago
- Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning"☆184May 20, 2025Updated 9 months ago
- Code for Paper (Policy Optimization in RLHF: The Impact of Out-of-preference Data)☆28Dec 19, 2023Updated 2 years ago
- JudgeLRM: Large Reasoning Models as a Judge☆41Jan 29, 2026Updated last month
- Official codebase for "Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling".☆283Feb 19, 2025Updated last year
- ☆20May 16, 2024Updated last year
- ☆34Feb 11, 2025Updated last year
- ☆14Jan 24, 2025Updated last year
- Unofficial implementation of Chain of Hindsight (https://arxiv.org/abs/2302.02676) using pytorch and huggingface Trainers.☆11Apr 5, 2023Updated 2 years ago
- ☆342Jun 5, 2025Updated 9 months ago
- Official Repository of "Learning to Reason under Off-Policy Guidance"☆417Oct 4, 2025Updated 5 months ago
- Text-to-Speech Synthesis by Generating Spectrograms using Generative Adversarial Network☆10Dec 12, 2018Updated 7 years ago
- ☆79Nov 19, 2024Updated last year
- Official repository for paper "DeepCritic: Deliberate Critique with Large Language Models"☆41Jun 24, 2025Updated 8 months ago
- [ACL' 25] The official code repository for PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models.☆88Feb 15, 2025Updated last year
- ☆51Oct 28, 2024Updated last year
- Code for NAACL 2025 paper "AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge"☆16Updated this week
- ☆17Mar 26, 2021Updated 4 years ago
- (ACL 2025) 🔥🔥🔥Code for "Empowering Multimodal Large Language Models with Evol-Instruct"☆20May 15, 2025Updated 9 months ago
- B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners☆86May 21, 2025Updated 9 months ago
- ☆38Nov 13, 2025Updated 3 months ago
- Official code implementation for the ACL 2025 paper: 'CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis'☆32May 19, 2025Updated 9 months ago
- Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning"☆158Oct 23, 2025Updated 4 months ago
- [NeurIPS 2025 Spotlight] LLM post-training suite for long-CoT reasoning, PRM, and code generation — featuring ReasonFlux, ReasonFlux-PRM,…☆521Sep 27, 2025Updated 5 months ago
- Chain-of-Thought Matters: Improving Long-Context Language Models with Reasoning Path Supervision☆18Apr 1, 2025Updated 11 months ago
- Official Code Repository for [AutoScale📈: Scale-Aware Data Mixing for Pre-Training LLMs] Published as a conference paper at **COLM 2025*…☆13Aug 8, 2025Updated 6 months ago
- ☆157May 28, 2025Updated 9 months ago
- Official Repo for Open-Reasoner-Zero☆2,087Jun 2, 2025Updated 9 months ago
- Scalable RL solution for advanced reasoning of language models☆1,811Mar 18, 2025Updated 11 months ago
- Collection of latest papers and materials in the area of RLVR!☆65Updated this week
- Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"☆392Jan 19, 2025Updated last year
- A Survey of Reinforcement Learning for Large Reasoning Models☆2,351Nov 9, 2025Updated 3 months ago
- Towards a Unified View of Large Language Model Post-Training☆204Sep 8, 2025Updated 5 months ago
- 😎 A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, Agent, and Beyond☆347Jan 22, 2026Updated last month
- ☆497Oct 11, 2025Updated 4 months ago
- ☆72Apr 2, 2024Updated last year
- From Hypothesis to Publication: A Comprehensive Survey of AI-Driven Research Support Systems☆17Nov 23, 2025Updated 3 months ago