A collection of links, tutorials, and best practices on how to collect data and build an end-to-end RLHF system to fine-tune generative AI models
☆227 · Jul 24, 2023 · Updated 2 years ago
Alternatives and similar repositories for RLHF
Users who are interested in RLHF are comparing it to the libraries listed below
- ☆15 · May 27, 2025 · Updated 9 months ago
- A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) ☆4,739 · Jan 8, 2024 · Updated 2 years ago
- A modular RL library to fine-tune language models to human preferences ☆2,382 · Mar 1, 2024 · Updated 2 years ago
- Natural Language to Code ☆14 · May 2, 2021 · Updated 4 years ago
- ☆29 · Apr 29, 2024 · Updated last year
- AI_Powered_Dev_Search_Engine ☆12 · Mar 10, 2024 · Updated 2 years ago
- RLHF implementation details of OAI's 2019 codebase ☆197 · Jan 14, 2024 · Updated 2 years ago
- A full pipeline to finetune ChatGLM LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Huma… ☆140 · Apr 28, 2023 · Updated 2 years ago
- 💙 Unstructured Data Connectors for Haystack 2.0 ☆17 · Sep 21, 2023 · Updated 2 years ago
- Perform visual question answering on your images ☆19 · May 8, 2024 · Updated last year
- Train transformer language models with reinforcement learning. ☆17,697 · Updated this week
- Code for the paper Fine-Tuning Language Models from Human Preferences ☆1,381 · Jul 25, 2023 · Updated 2 years ago
- A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data. ☆842 · Jul 1, 2024 · Updated last year
- ACL 2022: Just Rank: Rethinking Evaluation with Word and Sentence Similarities ☆35 · Dec 14, 2022 · Updated 3 years ago
- Tutorial on probabilistic classification and cost-sensitive learning. ☆13 · Aug 19, 2025 · Updated 7 months ago
- Slides and notebook for the workshop on serving bert models in production ☆25 · Nov 12, 2022 · Updated 3 years ago
- Robust recipes to align language models with human and AI preferences ☆5,527 · Sep 8, 2025 · Updated 6 months ago
- This is the code for our paper: PLACES: Prompting Language Models for Social Conversation Synthesis ☆11 · Feb 17, 2023 · Updated 3 years ago
- ☆17 · Dec 31, 2023 · Updated 2 years ago
- EasyRLHF aims to provide an easy and minimal interface to train aligned language models, using off-the-shelf solutions and datasets ☆10 · Dec 12, 2023 · Updated 2 years ago
- This is code for most of the experiments in the paper Understanding the Effects of RLHF on LLM Generalisation and Diversity ☆48 · Jan 19, 2024 · Updated 2 years ago
- Code for "What really matters in matrix-whitening optimizers?" ☆23 · Oct 31, 2025 · Updated 4 months ago
- A Python library for creating adversarial splits