PKU-Alignment / AlignmentSurvey
AI Alignment: A Comprehensive Survey
☆133 Updated last year
Alternatives and similar repositories for AlignmentSurvey:
Users who are interested in AlignmentSurvey are comparing it to the repositories listed below:
- Code for Paper (ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models) ☆179 Updated last year
- A curated reading list for large language model (LLM) alignment. Take a look at our new survey "Large Language Model Alignment: A Survey"… ☆77 Updated last year
- Curation of resources for LLM mathematical reasoning, most of which are screened by @tongyx361 to ensure high quality and accompanied wit… ☆121 Updated 8 months ago
- Related work and background techniques for OpenAI o1 ☆217 Updated 2 months ago
- [NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct ☆167 Updated 2 months ago
- An index of algorithms for reinforcement learning from human feedback (RLHF) ☆93 Updated 11 months ago
- Feeling confused about super alignment? Here is a reading list ☆42 Updated last year
- Reference implementation for Token-level Direct Preference Optimization (TDPO) ☆130 Updated last month
- A Comprehensive Survey on Long Context Language Modeling ☆113 Updated last week
- ☆85 Updated 3 weeks ago
- ☆54 Updated 5 months ago
- Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning ☆161 Updated 2 weeks ago
- Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision". ☆52 Updated 4 months ago
- Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models ☆252 Updated 6 months ago
- Fantastic Data Engineering for Large Language Models ☆85 Updated 3 months ago
- ☆115 Updated 2 months ago
- ☆216 Updated this week
- A continually updated list of literature on Reinforcement Learning from AI Feedback (RLAIF) ☆158 Updated 2 months ago
- SOTA RL fine-tuning solution for advanced math reasoning of LLM ☆92 Updated this week
- On Memorization of Large Language Models in Logical Reasoning ☆60 Updated this week
- A research repo for experiments about Reinforcement Finetuning ☆37 Updated 2 weeks ago
- ☆171 Updated last month
- ☆22 Updated 8 months ago
- This is the repository that contains the source code for the Self-Evaluation Guided MCTS for online DPO. ☆301 Updated 7 months ago
- ☆129 Updated last week
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference) ☆132 Updated 5 months ago
- OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning ☆125 Updated 3 months ago
- ☆59 Updated last week
- Paper List for a new paradigm of NLP: Interactive NLP (https://arxiv.org/abs/2305.13246) ☆214 Updated last year
- [NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models ☆58 Updated 3 months ago