An index of algorithms for reinforcement learning from human feedback (RLHF)
☆92Apr 17, 2024Updated last year
Alternatives and similar repositories for awesome-rlhf
Users who are interested in awesome-rlhf are comparing it to the repositories listed below
- Code for Paper (Policy Optimization in RLHF: The Impact of Out-of-preference Data)☆29Dec 19, 2023Updated 2 years ago
- [ICML 2022] The official implementation of DWBC in "Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations"☆35Jan 5, 2023Updated 3 years ago
- Code and models for EMNLP 2024 paper "WPO: Enhancing RLHF with Weighted Preference Optimization"☆41Sep 24, 2024Updated last year
- Facebear's minimal implementation of SBAC (Soft behavior regularized actor critic, NIPS22 offline RL workshop)☆11Jul 4, 2022Updated 3 years ago
- A recipe for online RLHF and online iterative DPO.☆545Dec 28, 2024Updated last year
- ☆21Aug 30, 2025Updated 6 months ago
- A curated list of reinforcement learning with human feedback resources (continually updated)☆4,331Dec 9, 2025Updated 3 months ago
- OpenLLMDE: An open source data engineering framework for LLMs☆18Sep 9, 2023Updated 2 years ago
- This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"☆17Feb 22, 2024Updated 2 years ago
- The official implementation of Self-Exploring Language Models (SELM)☆63Jun 4, 2024Updated last year
- RewardBench: the first evaluation tool for reward models.☆704Feb 16, 2026Updated last month
- Repo of "Large Language Model-based Human-Agent Collaboration for Complex Task Solving" (EMNLP 2024 Findings)☆34Sep 20, 2024Updated last year
- Deep Weighted Averaging Classifiers☆23Feb 4, 2019Updated 7 years ago
- Collection of papers and resources for data augmentation (DA) in visual reinforcement learning (RL).☆81Mar 27, 2024Updated last year
- Lipschitz Lifelong RL☆11Nov 6, 2020Updated 5 years ago
- [ICLR 2023 Oral] The official implementation of SQL and EQL in "Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Reg…☆46Jul 27, 2023Updated 2 years ago
- The server portion of the Neural Chat project to deploy chatbots on web. This code is accompanied by another repository that includes the…☆37Jun 10, 2021Updated 4 years ago
- TensorFlow implementation for our paper "Learning Long-Term Reward Redistribution via Randomized Return Decomposition"☆19Mar 17, 2022Updated 4 years ago
- Recipes to train reward model for RLHF.☆1,521Apr 24, 2025Updated 10 months ago
- ☆58Jun 13, 2024Updated last year
- ☆12Apr 29, 2024Updated last year
- ☆10May 17, 2024Updated last year
- Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback☆1,591Nov 24, 2025Updated 3 months ago
- ☆51Oct 28, 2024Updated last year
- Learning algorithm implementation and experiments in the paper "A Composable Specification Language for Reinforcement Learning Tasks" (ht…☆17Nov 23, 2020Updated 5 years ago
- CodeUltraFeedback: aligning large language models to coding preferences (TOSEM 2025)☆73Jun 25, 2024Updated last year
- Implementation of the Playground environment from the paper Language as a Cognitive Tool to Imagine Goals in Curiosity-Driven Exploration.☆11Mar 5, 2021Updated 5 years ago
- [ACL 2024] The project of Symbol-LLM☆58Jul 10, 2024Updated last year
- Official repo for EMNLP'24 paper "SOUL: Unlocking the Power of Second-Order Optimization for LLM Unlearning"☆29Oct 1, 2024Updated last year
- StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback☆74Aug 31, 2024Updated last year
- A PyTorch implementation for the paper 'Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observatio…☆14Sep 22, 2021Updated 4 years ago
- Self-Supervised Alignment with Mutual Information☆20May 24, 2024Updated last year
- NAACL 2019 paper: Density Matching for Bilingual Word Embedding (Zhou et al., 2019)☆63Dec 8, 2022Updated 3 years ago
- Official implementation of ICLR'24 paper, "Curiosity-driven Red Teaming for Large Language Models" (https://openreview.net/pdf?id=4KqkizX…☆88Mar 15, 2024Updated 2 years ago
- This repository is the official implementation of Bidirectional Learning for Offline Infinite-width Model-based Optimization (NeurIPS 202…☆14Jan 19, 2023Updated 3 years ago
- A curated reading list for large language model (LLM) alignment. Take a look at our new survey "Large Language Model Alignment: A Survey"…☆81Sep 28, 2023Updated 2 years ago
- Code and data for the paper "Understanding Hidden Context in Preference Learning: Consequences for RLHF"☆33Dec 14, 2023Updated 2 years ago
- The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism☆30Jul 17, 2024Updated last year
- LineArt, a framework that transfers complex appearance onto detailed design drawings, facilitating design and artistic creation.☆14Oct 2, 2025Updated 5 months ago