louieworth / awesome-rlhfLinks

An index of algorithms for reinforcement learning from human feedback (rlhf))

☆92

Alternatives and similar repositories for awesome-rlhf

Users that are interested in awesome-rlhf are comparing it to the libraries listed below

Sorting:

liziniu / ReMax
Code for Paper (ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models)
☆194Updated last year
ZHZisZZ / modpo
[ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization
☆91Updated last year
Vance0124 / Token-level-Direct-Preference-Optimization
Reference implementation for Token-level Direct Preference Optimization(TDPO)
☆148Updated 8 months ago
YangRui2015 / RiC
Code for the ICML 2024 paper "Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment"
☆77Updated 4 months ago
PKU-Alignment / aligner
[NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct
☆188Updated 9 months ago
PKU-Alignment / AlignmentSurvey
AI Alignment: A Comprehensive Survey
☆135Updated last year
YangRui2015 / Generalizable-Reward-Model
Code for NeurIPS 2024 paper "Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs"
☆41Updated 8 months ago
Magnetic2014 / llm-alignment-survey
A curated reading list for large language model (LLM) alignment. Take a look at our new survey "Large Language Model Alignment: A Survey"…
☆81Updated 2 years ago
holarissun / RewardModelingBeyondBradleyTerry
official implementation of ICLR'2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and…
☆66Updated 6 months ago
PRIME-RL / Entropy-Mechanism-of-RL
The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning.
☆360Updated 3 months ago
YifeiZhou02 / ArCHer
Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL"
☆196Updated 6 months ago
Xuekai-Zhu / key-configuration-of-llms
☆23Updated last year
yubol-bobo / Awesome-Multi-Turn-LLMs
This is the official GitHub repository for our survey paper "Beyond Single-Turn: A Survey on Multi-Turn Interactions with Large Language …
☆125Updated 5 months ago
liziniu / policy_optimization
Code for Paper (Policy Optimization in RLHF: The Impact of Out-of-preference Data)
☆28Updated last year
CJReinforce / PURE
Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning"
☆138Updated 3 months ago
RyanLiu112 / Awesome-Process-Reward-Models
A comprehensive collection of process reward models.
☆115Updated 3 weeks ago
ValueCompass / Alignment-Goal-Survey
☆29Updated last year
junkangwu / beta-DPO
[NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$
☆49Updated last year
YuxiXie / MCTS-DPO
This is the repository that contains the source code for the Self-Evaluation Guided MCTS for online DPO.
☆327Updated last year
CUHK-ARISE / GAMABench
Benchmarking LLMs' Gaming Ability in Multi-Agent Environments
☆88Updated 5 months ago
GAIR-NLP / LIMR
☆211Updated 8 months ago
OpenDFM / Rememberer
[NeurIPS 2023] Large Language Models Are Semi-Parametric Reinforcement Learning Agents
☆37Updated last year
mengdi-li / awesome-RLAIF
A continually updated list of literature on Reinforcement Learning from AI Feedback (RLAIF)
☆187Updated 2 months ago
yihedeng9 / rlhf-summary-notes
A brief and partial summary of RLHF algorithms.
☆132Updated 7 months ago
Linear95 / APO
Code for ACL2024 paper - Adversarial Preference Optimization (APO).
☆57Updated last year
ziyuwan / ReMA-public
Reinforced Multi-LLM Agents training
☆56Updated 4 months ago
0xallam / Direct-Preference-Optimization
Direct Preference Optimization from scratch in PyTorch
☆116Updated 6 months ago
thu-ml / Noise-Contrastive-Alignment
Code accompanying the paper "Noise Contrastive Alignment of Language Models with Explicit Rewards" (NeurIPS 2024)
☆57Updated 11 months ago
WooooDyy / MathCritique
Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision".
☆56Updated 10 months ago
alexrame / rewardedsoups
Rewarded soups official implementation
☆60Updated 2 years ago