RyanLiu112 / GenPRM
Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".
☆73 Updated last month
Alternatives and similar repositories for GenPRM
Users who are interested in GenPRM are comparing it to the repositories listed below.
- Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision". ☆54 Updated 6 months ago
- A comprehensive collection of process reward models. ☆85 Updated last week
- RM-R1: Unleashing the Reasoning Potential of Reward Models ☆93 Updated this week
- This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning" ☆64 Updated last month
- ☆60 Updated last week
- ☆198 Updated last week
- Official Repository of "Learning to Reason under Off-Policy Guidance" ☆205 Updated this week
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations ☆106 Updated last month
- Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning" ☆120 Updated this week
- This is my attempt to create Self-Correcting-LLM based on the paper Training Language Models to Self-Correct via Reinforcement Learning by g… ☆35 Updated last month
- [ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning ☆59 Updated 5 months ago
- ☆50 Updated this week
- ☆145 Updated last week
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ☆98 Updated 3 weeks ago
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs. ☆121 Updated 2 months ago
- The official code repository for PRMBench. ☆73 Updated 3 months ago
- [ICLR 2025] Benchmarking Agentic Workflow Generation ☆93 Updated 3 months ago
- ☆45 Updated last month
- A research repo for experiments about Reinforcement Finetuning ☆47 Updated last month
- The official code of paper “Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning” ☆99 Updated this week
- The official repository of the Omni-MATH benchmark. ☆83 Updated 5 months ago
- Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning". ☆94 Updated 2 months ago
- ☆18 Updated last month
- A Comprehensive Survey on Long Context Language Modeling ☆147 Updated 2 weeks ago
- ☆173 Updated this week
- The implementation of paper "LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Fee… ☆39 Updated 10 months ago
- ☆64 Updated last month
- [AAAI 2025 oral] Evaluating Mathematical Reasoning Beyond Accuracy ☆61 Updated 5 months ago
- This repository collects research papers on learning from rewards in the context of post-training and test-time scaling of large language… ☆37 Updated 3 weeks ago
- A Framework for LLM-based Multi-Agent Reinforced Training and Inference ☆61 Updated this week