CarperAI/trlx

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

☆4,753

Alternatives and similar repositories for trlx

Users who are interested in trlx are comparing it to the libraries listed below.

allenai / RL4LMs
A modular RL library to fine-tune language models to human preferences
☆2,393 · Mar 1, 2024 · Updated 2 years ago

huggingface / trl
Train transformer language models with reinforcement learning.
☆18,913 · Updated this week

lucidrains / PaLM-rlhf-pytorch
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
☆7,867 · May 29, 2026 · Updated last month

CarperAI / cheese
Used for adaptive human in the loop evaluation of language and embedding models.
☆306 · Mar 1, 2023 · Updated 3 years ago

openai / lm-human-preferences
Code for the paper Fine-Tuning Language Models from Human Preferences
☆1,393 · Jul 25, 2023 · Updated 2 years ago

anthropics / hh-rlhf
Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"
☆1,852 · Jun 17, 2025 · Updated last year

huggingface / peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
☆21,441 · Updated this week

deepspeedai / DeepSpeedExamples
Example models using DeepSpeed
☆6,828 · Updated this week

GanjinZero / RRHF
[NIPS2023] RRHF & Wombat
☆805 · Sep 22, 2023 · Updated 2 years ago

opendilab / awesome-RLHF
A curated list of reinforcement learning with human feedback resources (continually updated)
☆4,416 · May 20, 2026 · Updated 2 months ago

huggingface / alignment-handbook
Robust recipes to align language models with human and AI preferences
☆5,643 · May 26, 2026 · Updated last month

OpenLMLab / MOSS-RLHF
Secrets of RLHF in Large Language Models Part I: PPO
☆1,426 · Mar 3, 2024 · Updated 2 years ago

eric-mitchell / direct-preference-optimization
Reference implementation for DPO (Direct Preference Optimization)
☆2,898 · Aug 11, 2024 · Updated last year

tatsu-lab / alpaca_farm
A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.
☆845 · Jul 1, 2024 · Updated 2 years ago

yizhongw / self-instruct
Aligning pretrained language models with instruction data generated by themselves.
☆4,606 · Mar 27, 2023 · Updated 3 years ago

tatsu-lab / stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
☆30,250 · Jul 17, 2024 · Updated 2 years ago

NVIDIA / Megatron-LM
Ongoing research training transformer models at scale
☆17,181 · Updated this week

Instruction-Tuning-with-GPT-4 / GPT-4-LLM
Instruction Tuning with GPT-4
☆4,332 · Jun 11, 2023 · Updated 3 years ago

tloen / alpaca-lora
Instruct-tune LLaMA on consumer hardware
☆18,912 · Jul 29, 2024 · Updated last year

deepspeedai / DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
☆42,787 · Updated this week

PKU-Alignment / safe-rlhf
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
☆1,611 · Nov 24, 2025 · Updated 8 months ago

EleutherAI / gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
☆7,446 · Jun 11, 2026 · Updated last month

openai / summarize-from-feedback
Code for "Learning to summarize from human feedback"
☆1,062 · Sep 5, 2023 · Updated 2 years ago

hpcaitech / ColossalAI
Making large AI models cheaper, faster and more accessible
☆41,420 · Jul 13, 2026 · Updated last week

OpenRLHF / OpenRLHF
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Asy…
☆9,841 · Jul 14, 2026 · Updated last week

lm-sys / FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
☆39,500 · May 1, 2026 · Updated 2 months ago

LAION-AI / Open-Assistant
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamical…
☆37,379 · Aug 17, 2024 · Updated last year

Dao-AILab / flash-attention
Fast and memory-efficient exact attention
☆24,519 · Updated this week

bigscience-workshop / Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
☆1,448 · Mar 20, 2024 · Updated 2 years ago

FMInference / FlexLLMGen
Running large language models on a single GPU for throughput-oriented scenarios.
☆9,363 · Oct 28, 2024 · Updated last year

microsoft / unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
☆22,168 · Jan 23, 2026 · Updated 6 months ago

facebookresearch / metaseq
Repo for external large-scale work
☆6,550 · Apr 27, 2024 · Updated 2 years ago

OpenGVLab / LLaMA-Adapter
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
☆5,916 · Mar 14, 2024 · Updated 2 years ago

EleutherAI / lm-evaluation-harness
A framework for few-shot evaluation of language models.
☆13,390 · Jul 13, 2026 · Updated last week

microsoft / LMOps
General technology for enabling AI capabilities w/ LLMs and MLLMs
☆4,443 · Updated this week

openai / prm800k
800,000 step-level correctness labels on LLM solutions to MATH problems
☆2,151 · Jun 1, 2023 · Updated 3 years ago

huggingface / text-generation-inference
Large Language Model Text Generation Inference
☆10,882 · Mar 21, 2026 · Updated 4 months ago

mosaicml / llm-foundry
LLM training code for Databricks foundation models
☆4,432 · Mar 25, 2026 · Updated 3 months ago

LianjiaTech / BELLE
BELLE: Be Everyone's Large Language model Engine (open-source Chinese conversational large language model)
☆8,276 · Oct 16, 2024 · Updated last year