PKU-Alignment / safe-rlhf
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
☆1,412 · Updated 8 months ago
Alternatives and similar repositories for safe-rlhf:
Users interested in safe-rlhf are comparing it to the libraries listed below.
- Secrets of RLHF in Large Language Models Part I: PPO ☆1,316 · Updated 11 months ago
- ☆889 · Updated 6 months ago
- [NeurIPS 2023] RRHF & Wombat ☆799 · Updated last year
- Open Academic Research on Improving LLaMA to SOTA LLM ☆1,618 · Updated last year
- A plug-and-play library for parameter-efficient-tuning (Delta Tuning) ☆1,011 · Updated 4 months ago
- [NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward ☆814 · Updated 3 months ago
- Reference implementation for DPO (Direct Preference Optimization) ☆2,367 · Updated 6 months ago
- ☆468 · Updated last month
- Aligning Large Language Models with Human: A Survey ☆715 · Updated last year
- ☆904 · Updated 8 months ago
- A collection of phenomena observed during the scaling of big foundation models, which may be developed into consensus, principles, or l… ☆276 · Updated last year
- Recipes to train reward models for RLHF. ☆1,160 · Updated last week
- A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data. ☆793 · Updated 7 months ago
- ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024) ☆565 · Updated 3 weeks ago
- Reading list of hallucination in LLMs. Check out our new survey paper: "Siren’s Song in the AI Ocean: A Survey on Hallucination in Large … ☆977 · Updated 2 months ago
- [ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning ☆411 · Updated 3 months ago
- Collaborative Training of Large Language Models in an Efficient Way ☆411 · Updated 5 months ago
- A modular RL library to fine-tune language models to human preferences ☆2,270 · Updated 11 months ago
- Code for the paper Fine-Tuning Language Models from Human Preferences ☆1,276 · Updated last year
- ⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024) ☆916 · Updated 2 months ago
- Paper List for In-context Learning 🌷 ☆835 · Updated 4 months ago
- Code for our EMNLP 2023 Paper: "LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models" ☆1,126 · Updated 11 months ago
- [ACL 2024] A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future ☆407 · Updated 3 weeks ago
- ☆456 · Updated 8 months ago
- We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs and parameter-efficient methods (e.g., lora, p-tunin… ☆2,682 · Updated last year
- Papers related to LLM agents published at top conferences ☆311 · Updated last year
- Best practice for training LLaMA models in Megatron-LM ☆642 · Updated last year
- Tuning LLMs with no tears💦; Sample Design Engineering (SDE) for more efficient downstream-tuning. ☆984 · Updated 9 months ago
- ☆318 · Updated 7 months ago
- A curated collection of open-source SFT datasets, continuously updated ☆481 · Updated last year