mengdi-li / awesome-RLAIF
A continually updated list of literature on Reinforcement Learning from AI Feedback (RLAIF)
☆160 Updated 2 months ago
Alternatives and similar repositories for awesome-RLAIF:
Users who are interested in awesome-RLAIF are comparing it to the repositories listed below
- AdaPlanner: Language Models for Decision Making via Adaptive Planning from Feedback ☆107 Updated 2 weeks ago
- Code for Paper (ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models) ☆181 Updated last year
- This is the repository that contains the source code for the Self-Evaluation Guided MCTS for online DPO. ☆301 Updated 8 months ago
- A collection of papers from the continuing line of work that started with World Models. ☆170 Updated 9 months ago
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference) ☆133 Updated 5 months ago
- [NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct ☆169 Updated 3 months ago
- AI Alignment: A Comprehensive Survey ☆133 Updated last year
- Reasoning with Language Model is Planning with World Model ☆163 Updated last year
- A brief and partial summary of RLHF algorithms. ☆127 Updated last month
- Reference implementation for Token-level Direct Preference Optimization (TDPO) ☆136 Updated 2 months ago
- Augmented LLM with self-reflection ☆118 Updated last year
- RewardBench: the first evaluation tool for reward models. ☆553 Updated last month
- Curation of resources for LLM mathematical reasoning, most of which are screened by @tongyx361 to ensure high quality and accompanied wit… ☆122 Updated 9 months ago
- [NeurIPS 2023] We use large language models as commonsense world model and heuristic policy within Monte-Carlo Tree Search, enabling bett… ☆265 Updated 5 months ago
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024] ☆133 Updated 4 months ago
- An Analytical Evaluation Board of Multi-turn LLM Agents [NeurIPS 2024 Oral] ☆303 Updated 10 months ago
- Official Repo for Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning ☆336 Updated 4 months ago
- [NeurIPS 2024] Agent Planning with World Knowledge Model ☆124 Updated 4 months ago
- A curated reading list for large language model (LLM) alignment. Take a look at our new survey "Large Language Model Alignment: A Survey"… ☆78 Updated last year
- Implementation of the ICML 2024 paper "Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning" pr… ☆98 Updated last year
- An index of algorithms for reinforcement learning from human feedback (RLHF) ☆93 Updated last year
- Self-Alignment with Principle-Following Reward Models ☆159 Updated last year