allenai / RL4LMs
A modular RL library to fine-tune language models to human preferences
☆2,292 Updated last year
Alternatives and similar repositories for RL4LMs:
Users who are interested in RL4LMs compare it to the libraries listed below:
- A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) ☆4,603 Updated last year
- Code for the paper Fine-Tuning Language Models from Human Preferences ☆1,300 Updated last year
- ☆1,507 Updated this week
- Reference implementation for DPO (Direct Preference Optimization) ☆2,465 Updated 7 months ago
- Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback" ☆1,707 Updated last year
- Reading list on instruction tuning, a trend that started with Natural-Instructions (ACL 2022), FLAN (ICLR 2022), and T0 (ICLR 2022). ☆766 Updated last year
- Code for our EMNLP 2023 Paper: "LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models" ☆1,141 Updated last year
- A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data. ☆801 Updated 8 months ago
- Implementation of ChatGPT RLHF (Reinforcement Learning with Human Feedback) on any generation model in huggingface's transformer (blommz-… ☆555 Updated 10 months ago
- Expanding natural instructions ☆983 Updated last year
- Secrets of RLHF in Large Language Models Part I: PPO ☆1,335 Updated last year
- [NIPS2023] RRHF & Wombat ☆804 Updated last year
- Benchmarking large language models' complex reasoning ability with chain-of-thought prompting ☆2,697 Updated 7 months ago
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ☆1,378 Updated last year
- Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models ☆2,997 Updated 8 months ago
- Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl… ☆2,463 Updated 7 months ago
- Aligning pretrained language models with instruction data generated by themselves. ☆4,314 Updated last year
- General technology for enabling AI capabilities w/ LLMs and MLLMs ☆3,896 Updated last week
- Toolkit for creating, sharing and using natural language prompts. ☆2,803 Updated last year
- Aligning Large Language Models with Human: A Survey ☆726 Updated last year
- A Unified Library for Parameter-Efficient and Modular Transfer Learning ☆2,669 Updated 2 weeks ago
- Paper List for In-context Learning 🌷 ☆849 Updated 5 months ago
- The hub for EleutherAI's work on interpretability and learning dynamics ☆2,423 Updated last week
- A collection of open-source datasets to train instruction-following LLMs (ChatGPT, LLaMA, Alpaca) ☆1,113 Updated last year
- Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback ☆1,431 Updated 9 months ago
- LOMO: LOw-Memory Optimization ☆981 Updated 8 months ago
- Contriever: Unsupervised Dense Information Retrieval with Contrastive Learning ☆719 Updated last year
- A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs). ☆817 Updated 2 weeks ago
- Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09… ☆2,131 Updated this week
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ☆2,022 Updated 3 weeks ago