PKU-Alignment / AlignmentSurvey
AI Alignment: A Comprehensive Survey
☆135 Updated last year
Alternatives and similar repositories for AlignmentSurvey
Users who are interested in AlignmentSurvey are comparing it to the repositories listed below
- Code for Paper (ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models) ☆185 Updated last year
- Curation of resources for LLM mathematical reasoning, most of which are screened by @tongyx361 to ensure high quality and accompanied wit… ☆131 Updated 11 months ago
- A curated reading list for large language model (LLM) alignment. Take a look at our new survey "Large Language Model Alignment: A Survey"… ☆81 Updated last year
- [NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct ☆177 Updated 5 months ago
- A Comprehensive Survey on Long Context Language Modeling ☆152 Updated 3 weeks ago
- The related works and background techniques about OpenAI o1 ☆222 Updated 5 months ago
- Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models ☆264 Updated 9 months ago
- ☆145 Updated 5 months ago
- An index of algorithms for reinforcement learning from human feedback (RLHF) ☆92 Updated last year
- Reference implementation for Token-level Direct Preference Optimization (TDPO) ☆141 Updated 4 months ago
- The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning. ☆191 Updated last week
- ☆220 Updated last month
- Feeling confused about super alignment? Here is a reading list ☆42 Updated last year
- This is the repository that contains the source code for the Self-Evaluation Guided MCTS for online DPO. ☆318 Updated 10 months ago
- A continually updated list of literature on Reinforcement Learning from AI Feedback (RLAIF) ☆171 Updated 5 months ago
- Project for the paper entitled `Instruction Tuning for Large Language Models: A Survey` ☆179 Updated 6 months ago
- Recent advancements propelled by large language models (LLMs), encompassing an array of domains including Vision, Audio, Agent, Robotics,… ☆122 Updated 3 weeks ago
- Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning ☆186 Updated 3 months ago
- Fantastic Data Engineering for Large Language Models ☆89 Updated 5 months ago
- Code for ACL2024 paper - Adversarial Preference Optimization (APO). ☆54 Updated last year
- All about large language models ☆51 Updated last year
- A comprehensive collection of process reward models. ☆92 Updated 2 weeks ago
- ☆63 Updated 7 months ago
- Papers related to LLM agents published at top conferences ☆315 Updated 2 months ago
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference) ☆144 Updated 7 months ago
- A collection of phenomena observed during the scaling of big foundation models, which may be developed into consensus, principles, or l… ☆282 Updated last year
- ☆109 Updated 3 months ago
- ☆33 Updated 9 months ago
- Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision". ☆54 Updated 6 months ago
- Repository for Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning ☆164 Updated last year