PKU-Alignment / AlignmentSurvey
AI Alignment: A Comprehensive Survey
☆133 · Updated last year
Alternatives and similar repositories for AlignmentSurvey
Users interested in AlignmentSurvey are comparing it to the repositories listed below.
- A curated reading list for large language model (LLM) alignment. Take a look at our new survey "Large Language Model Alignment: A Survey"… ☆80 · Updated last year
- An index of algorithms for reinforcement learning from human feedback (RLHF) ☆92 · Updated last year
- Curation of resources for LLM mathematical reasoning, most of which are screened by @tongyx361 to ensure high quality and accompanied wit… ☆127 · Updated 10 months ago
- Code for Paper (ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models) ☆181 · Updated last year
- Fantastic Data Engineering for Large Language Models ☆87 · Updated 4 months ago
- [NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct ☆171 · Updated 4 months ago
- Feeling confused about super alignment? Here is a reading list ☆42 · Updated last year
- On Memorization of Large Language Models in Logical Reasoning ☆64 · Updated last month
- A Comprehensive Survey on Long Context Language Modeling ☆142 · Updated last month
- ☆168 · Updated last month
- Related works and background techniques for OpenAI o1 ☆221 · Updated 4 months ago
- A continually updated list of literature on Reinforcement Learning from AI Feedback (RLAIF) ☆165 · Updated 3 months ago
- Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models ☆261 · Updated 8 months ago
- [ICML 2024] Can AI Assistants Know What They Don't Know? ☆80 · Updated last year
- [ICML 2024] Selecting High-Quality Data for Training Language Models ☆171 · Updated 10 months ago
- A research repo for experiments about Reinforcement Finetuning ☆46 · Updated last month
- ☆138 · Updated 5 months ago
- Collection of papers for scalable automated alignment. ☆89 · Updated 6 months ago
- Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (… ☆78 · Updated this week
- ☆31 · Updated 8 months ago
- A comprehensive collection of process reward models. ☆76 · Updated last week
- Paper List for a new paradigm of NLP: Interactive NLP (https://arxiv.org/abs/2305.13246) ☆214 · Updated last year
- [ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning". ☆119 · Updated 6 months ago
- A collection of phenomena observed during the scaling of big foundation models, which may be developed into consensus, principles, or l… ☆280 · Updated last year
- ☆97 · Updated 2 months ago
- ☆102 · Updated 5 months ago
- [EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing". ☆77 · Updated 4 months ago
- A curated list of awesome resources dedicated to Scaling Laws for LLMs ☆71 · Updated 2 years ago
- Reference implementation for Token-level Direct Preference Optimization (TDPO) ☆138 · Updated 3 months ago
- Official Repository of "Learning to Reason under Off-Policy Guidance" ☆173 · Updated last week