Train transformer language models with reinforcement learning.
★17,460 · Feb 26, 2026 · Updated this week
Alternatives and similar repositories for trl
Users who are interested in trl are comparing it to the libraries listed below.
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. · ★20,678 · Updated this week
- A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) · ★4,741 · Jan 8, 2024 · Updated 2 years ago
- An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & TIS & vLLM & Ray & Async RL) · ★9,037 · Feb 21, 2026 · Updated last week
- verl: Volcano Engine Reinforcement Learning for LLMs · ★19,339 · Updated this week
- Fast and memory-efficient exact attention · ★22,361 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs · ★71,234 · Updated this week
- A modular RL library to fine-tune language models to human preferences · ★2,378 · Mar 1, 2024 · Updated 2 years ago
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. · ★41,706 · Updated this week
- Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024) · ★67,659 · Updated this week
- Fully open reproduction of DeepSeek-R1 · ★25,910 · Nov 24, 2025 · Updated 3 months ago
- Robust recipes to align language models with human and AI preferences · ★5,506 · Sep 8, 2025 · Updated 5 months ago
- A framework for few-shot evaluation of language models. · ★11,478 · Feb 15, 2026 · Updated 2 weeks ago
- Ongoing research training transformer models at scale · ★15,461 · Updated this week
- Large Language Model Text Generation Inference · ★10,788 · Jan 8, 2026 · Updated last month
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. · ★39,414 · Jun 2, 2025 · Updated 9 months ago
- SGLang is a high-performance serving framework for large language models and multimodal models. · ★23,905 · Updated this week
- [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond. · ★24,478 · Aug 12, 2024 · Updated last year
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… · ★9,513 · Updated this week
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities · ★22,033 · Jan 23, 2026 · Updated last month
- Code and documentation to train Stanford's Alpaca models, and generate the data. · ★30,271 · Jul 17, 2024 · Updated last year
- QLoRA: Efficient Finetuning of Quantized LLMs · ★10,843 · Jun 10, 2024 · Updated last year
- Accessible large language models via k-bit quantization for PyTorch. · ★7,997 · Updated this week
- Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM. · ★52,724 · Updated this week
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model… · ★157,071 · Updated this week
- Example models using DeepSpeed · ★6,791 · Feb 7, 2026 · Updated 3 weeks ago
- Inference code for Llama models · ★59,166 · Jan 26, 2025 · Updated last year
- Tools for merging pretrained large language models. · ★6,814 · Jan 26, 2026 · Updated last month
- Reference implementation for DPO (Direct Preference Optimization) · ★2,855 · Aug 11, 2024 · Updated last year
- Instruct-tune LLaMA on consumer hardware · ★18,972 · Jul 29, 2024 · Updated last year
- Welcome to the Llama Cookbook! This is your go to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We als… · ★18,220 · Nov 3, 2025 · Updated 3 months ago
- Making large AI models cheaper, faster and more accessible · ★41,359 · Feb 23, 2026 · Updated last week
- A curated list of reinforcement learning with human feedback resources (continually updated) · ★4,306 · Dec 9, 2025 · Updated 2 months ago
- Go ahead and axolotl questions · ★11,335 · Updated this week
- Aligning pretrained language models with instruction data generated by themselves. · ★4,580 · Mar 27, 2023 · Updated 2 years ago
- [ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters · ★5,936 · Mar 14, 2024 · Updated last year
- Minimal reproduction of DeepSeek R1-Zero · ★12,853 · Updated this week
- LlamaIndex is the leading document agent and OCR platform · ★47,210 · Updated this week
- Simple RL training for reasoning · ★3,830 · Dec 23, 2025 · Updated 2 months ago
- AllenAI's post-training codebase · ★3,592 · Updated this week