princeton-pli / what-makes-good-rm

[NeurIPS 2025] What Makes a Reward Model a Good Teacher? An Optimization Perspective

☆38

Alternatives and similar repositories for what-makes-good-rm

Users who are interested in what-makes-good-rm are comparing it to the repositories listed below.


bethgelab / sober-reasoning
A Sober Look at Language Model Reasoning
☆85 Updated 2 weeks ago
holarissun / RewardModelingBeyondBradleyTerry
Official implementation of the ICLR 2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and…
☆66 Updated 6 months ago
ryoungj / BoLT
Code for "Reasoning to Learn from Latent Thoughts"
☆121 Updated 6 months ago
ZhentingWang / DUMP
☆30 Updated 5 months ago
sail-sg / AnytimeReasoner
Optimizing Anytime Reasoning via Budget Relative Policy Optimization
☆47 Updated 3 months ago
sail-sg / variational-reasoning
Code for "Variational Reasoning for Language Models"
☆49 Updated 3 weeks ago
sail-sg / dice
Official implementation of Bootstrapping Language Models via DPO Implicit Rewards
☆44 Updated 6 months ago
rdi-berkeley / awesome-RLVR-boundary
A curated list of resources on Reinforcement Learning with Verifiable Rewards (RLVR) and the reasoning capability boundary of Large Langu…
☆70 Updated this week
facebookresearch / SPG
Code for the paper "SPG: Sandwiched Policy Gradient for Masked Diffusion Language Models"
☆24 Updated last week
Optimization-AI / DisCO
Discriminative Constrained Optimization for Reinforcing Large Reasoning Models
☆38 Updated last week
sail-sg / Attention-Sink
[ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)
☆131 Updated 3 months ago
sail-sg / feedback-conditional-policy
Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards"
☆47 Updated 3 weeks ago
haotiansun14 / BBox-Adapter
Lightweight Adapting for Black-Box Large Language Models
☆23 Updated last year
keven980716 / weak-to-strong-deception
[ICLR 2025] Code and data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization"
☆13 Updated last year
formll / resolving-scaling-law-discrepancies
☆20 Updated last year
RLHFlow / RAFT
This is an official implementation of the Reward rAnked Fine-Tuning Algorithm (RAFT), also known as iterative best-of-n fine-tuning or re…
☆37 Updated last year
UCSB-NLP-Chang / ThinkPrune
☆44 Updated last month
uservan / ThinkPO
☆17 Updated 2 months ago
yunfeixie233 / ViGaL
☆60 Updated last week
liziniu / GEM
Code for the paper "Preserving Diversity in Supervised Fine-tuning of Large Language Models"
☆40 Updated 5 months ago
ZHZisZZ / modpo
[ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization
☆91 Updated last year
cassidylaidlaw / orpo
☆19 Updated 11 months ago
Jiacheng-Zhu-AIML / AsymmetryLoRA
Preprint: Asymmetry in Low-Rank Adapters of Foundation Models
☆35 Updated last year
sail-sg / VeriFree
Reinforcing General Reasoning without Verifiers
☆91 Updated 4 months ago
Jiuzhouh / Uncertainty-Aware-Language-Agent
Official repository for "Towards Uncertainty-Aware Language Agent".
☆29 Updated last year
socialfoundations / tttlm
Test-time-training on nearest neighbors for large language models
☆46 Updated last year
Model-GLUE / Model-GLUE
☆18 Updated last year
hkust-nlp / mstar
[ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning
☆69 Updated 3 months ago
junkangwu / beta-DPO
[NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$
☆49 Updated last year
princeton-nlp / unintentional-unalignment
[ICLR 2025] Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
☆31 Updated 9 months ago