genrm-star/genrm-critiques

GenRM-CoT: Data release for verification rationales

☆68

Alternatives and similar repositories for genrm-critiques

Users who are interested in genrm-critiques are comparing it to the libraries listed below.

rookie-joe / AutoPSV
☆50 · Oct 28, 2024 · Updated last year
cmu-mind / RISE
☆34 · Oct 31, 2024 · Updated last year
RLHFlow / GVM
☆16 · Jul 29, 2025 · Updated last year
Edward-Sun / easy-to-hard
Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision
☆124 · Sep 9, 2024 · Updated last year
OFA-Sys / gsm8k-ScRel
Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models
☆270 · Sep 12, 2024 · Updated last year
snu-mllab / Bayesian-Red-Teaming
Official PyTorch implementation of "Query-Efficient Black-Box Red Teaming via Bayesian Optimization" (ACL'23)
☆15 · Jul 9, 2023 · Updated 3 years ago
allenai / reward-bench
RewardBench: the first evaluation tool for reward models.
☆727 · Feb 16, 2026 · Updated 5 months ago
liujch1998 / ppo-mcts
☆21 · Nov 13, 2023 · Updated 2 years ago
zankner / CLoud
Critique-out-Loud Reward Models
☆76 · Oct 18, 2024 · Updated last year
iiis-ai / IterativeQuestionComposing
[AAAI 2025] Augmenting Math Word Problems via Iterative Question Composing (https://arxiv.org/abs/2401.09003)
☆23 · Oct 2, 2025 · Updated 9 months ago
likenneth / q_probe
Q-Probe: A Lightweight Approach to Reward Maximization for Language Models
☆40 · Jun 10, 2024 · Updated 2 years ago
hkust-nlp / dart-math
[NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving*
☆120 · Dec 10, 2024 · Updated last year
general-preference / general-preference-model
[ICML 2025] Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment (https://arxiv.org/abs/2410.02197)
☆43 · Jun 15, 2026 · Updated last month
RLHFlow / RLHF-Reward-Modeling
Recipes to train reward model for RLHF.
☆1,535 · Apr 24, 2025 · Updated last year
henrykmichalewski / math-evals
Math evaluations of llama models.
☆10 · Jan 3, 2024 · Updated 2 years ago
srush / awesome-o1
A bibliography and survey of the papers surrounding o1
☆1,214 · Jul 7, 2026 · Updated 3 weeks ago
liziniu / cold_start_rl
Code for Blog Post: Can Better Cold-Start Strategies Improve RL Training for LLMs?
☆20 · Mar 9, 2025 · Updated last year
StanfordMIMI / dspy-helm
Structured Prompts Improve Evaluation of Language Models
☆15 · Jun 5, 2026 · Updated last month
duterscmy / CD-MoE
Official PyTorch implementation of CD-MOE
☆12 · Mar 18, 2026 · Updated 4 months ago
MARIO-Math-Reasoning / Super_MARIO
☆341 · Jun 5, 2025 · Updated last year
Freder-chen / ReasonGenRM
A simple implementation of ReasonGenRM.
☆19 · Apr 21, 2025 · Updated last year
ablghtianyi / ICL_Modular_Arithmetic
☆19 · Mar 25, 2025 · Updated last year
sanowl / Self-Correcting-LLM--Reinforcement-Learning-
This is my attempt to create Self-Correcting-LLM based on the paper Training Language Models to Self-Correct via Reinforcement Learning by g…
☆37 · Jul 9, 2025 · Updated last year
WooooDyy / MathCritique
Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision".
☆55 · Nov 29, 2024 · Updated last year
mukhal / GRACE
[EMNLP '23] Discriminator-Guided Chain-of-Thought Reasoning
☆50 · Oct 11, 2024 · Updated last year
yiqingxyq / RepoST
Code for "[COLM'25] RepoST: Scalable Repository-Level Coding Environment Construction with Sandbox Testing"
☆24 · Mar 18, 2025 · Updated last year
KbsdJames / Omni-MATH
The official repository of the Omni-MATH benchmark.
☆94 · Dec 22, 2024 · Updated last year
QwenLM / ProcessBench
Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning"
☆190 · May 20, 2025 · Updated last year
SWE-Gym / SWE-Bench-Fork
☆13 · Mar 5, 2025 · Updated last year
THUDM / ReST-MCTS
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)
☆709 · Jan 20, 2025 · Updated last year
PRIME-RL / ImplicitPRM
Repo of paper "Free Process Rewards without Process Labels"
☆172 · Mar 14, 2025 · Updated last year
facebookresearch / cruxeval
CRUXEval: Code Reasoning, Understanding, and Execution Evaluation
☆173 · Oct 11, 2024 · Updated last year
AlphaPav / mem-kk-logic
On Memorization of Large Language Models in Logical Reasoning
☆79 · Mar 29, 2025 · Updated last year
complex-reasoning / RPG
[ICLR 2026] RPG: KL-Regularized Policy Gradient (https://arxiv.org/abs/2505.17508)
☆76 · Jun 29, 2026 · Updated last month
yale-nlp / ODSum
Data and code for paper "ODSum: New Benchmarks for Open Domain Multi-Document Summarization"
☆11 · Sep 20, 2024 · Updated last year
TIGER-AI-Lab / AceCoder
The official repo for "AceCoder: Acing Coder RL via Automated Test-Case Synthesis" [ACL25]
☆100 · Apr 9, 2025 · Updated last year
haozheji / exact-optimization
ICML 2024 - Official Repository for EXO: Towards Efficient Exact Optimization of Language Model Alignment
☆55 · Jun 16, 2024 · Updated 2 years ago
dunzeng / MORE
Code for EMNLP'24 paper - On Diversified Preferences of Large Language Model Alignment
☆16 · Aug 6, 2024 · Updated last year
huggingface / Math-Verify
☆1,172 · Jan 10, 2026 · Updated 6 months ago