mengdi-li / awesome-RLAIF
A continually updated list of literature on Reinforcement Learning from AI Feedback (RLAIF)
☆137 · Updated last month
Related projects
Alternatives and complementary repositories for awesome-RLAIF
- Paper collections of the continuing effort starting from World Models ☆130 · Updated 4 months ago
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference) ☆96 · Updated last week
- Source code for Self-Evaluation Guided MCTS for online DPO ☆187 · Updated 3 months ago
- AI Alignment: A Comprehensive Survey ☆128 · Updated last year
- Reference implementation for Token-level Direct Preference Optimization (TDPO) ☆104 · Updated 4 months ago
- Implementation of the ICML 2024 paper "Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning" pr… ☆72 · Updated 9 months ago
- ☆257 · Updated 11 months ago
- Research code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL" ☆102 · Updated 7 months ago
- ☆85 · Updated 3 months ago
- ☆89 · Updated 4 months ago
- Code for the paper "ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models" ☆150 · Updated 10 months ago
- ☆98 · Updated 5 months ago
- ☆112 · Updated 3 months ago
- ☆113 · Updated 3 months ago
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆95 · Updated 2 months ago
- AdaPlanner: Language Models for Decision Making via Adaptive Planning from Feedback ☆92 · Updated last year
- ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024) ☆292 · Updated 3 weeks ago
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization ☆52 · Updated 2 months ago
- [NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct ☆111 · Updated last week
- Official repo for Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning ☆199 · Updated last month
- Self-Alignment with Principle-Following Reward Models ☆148 · Updated 8 months ago
- Reasoning with Language Model is Planning with World Model ☆144 · Updated last year
- Sotopia: an Open-ended Social Learning Environment (ICLR 2024 Spotlight) ☆163 · Updated this week
- ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings (NeurIPS 2023 Oral) ☆233 · Updated 6 months ago
- RewardBench: the first evaluation tool for reward models ☆424 · Updated 2 weeks ago
- BeaverTails: a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs) ☆111 · Updated last year
- [NeurIPS 2024] Official implementation of "Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs" ☆61 · Updated 3 weeks ago
- Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging ☆96 · Updated last year
- An index of algorithms for reinforcement learning from human feedback (RLHF) ☆87 · Updated 6 months ago
- [NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models" ☆82 · Updated 8 months ago