Train transformer language models with reinforcement learning.
☆17,523 · Mar 5, 2026 · Updated this week
Alternatives and similar repositories for trl
Users who are interested in trl compare it to the libraries listed below.
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ☆20,717 · Updated this week
- A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) ☆4,738 · Jan 8, 2024 · Updated 2 years ago
- An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & TIS & vLLM & Ray & Async RL) ☆9,084 · Updated this week
- verl: Volcano Engine Reinforcement Learning for LLMs ☆19,519 · Updated this week
- Fast and memory-efficient exact attention ☆22,460 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆71,883 · Updated this week
- A modular RL library to fine-tune language models to human preferences ☆2,380 · Mar 1, 2024 · Updated 2 years ago
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ☆41,706 · Feb 27, 2026 · Updated last week
- Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024) ☆67,659 · Feb 27, 2026 · Updated last week
- Fully open reproduction of DeepSeek-R1 ☆25,910 · Nov 24, 2025 · Updated 3 months ago
- Robust recipes to align language models with human and AI preferences ☆5,510 · Sep 8, 2025 · Updated 5 months ago
- A framework for few-shot evaluation of language models. ☆11,540 · Updated this week
- Ongoing research training transformer models at scale ☆15,461 · Updated this week
- Large Language Model Text Generation Inference ☆10,788 · Jan 8, 2026 · Updated last month
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ☆39,426 · Jun 2, 2025 · Updated 9 months ago
- SGLang is a high-performance serving framework for large language models and multimodal models. ☆23,905 · Updated this week
- [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond. ☆24,500 · Aug 12, 2024 · Updated last year
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ☆9,528 · Updated this week
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ☆22,030 · Jan 23, 2026 · Updated last month
- Code and documentation to train Stanford's Alpaca models, and generate the data. ☆30,271 · Jul 17, 2024 · Updated last year
- QLoRA: Efficient Finetuning of Quantized LLMs ☆10,843 · Jun 10, 2024 · Updated last year
- Accessible large language models via k-bit quantization for PyTorch. ☆7,997 · Feb 26, 2026 · Updated last week
- Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM. ☆53,029 · Updated this week
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model… ☆157,071 · Feb 27, 2026 · Updated last week
- Example models using DeepSpeed ☆6,791 · Feb 7, 2026 · Updated 3 weeks ago
- Inference code for Llama models ☆59,183 · Jan 26, 2025 · Updated last year
- Tools for merging pretrained large language models. ☆6,826 · Updated this week
- Reference implementation for DPO (Direct Preference Optimization) ☆2,859 · Aug 11, 2024 · Updated last year
- Instruct-tune LLaMA on consumer hardware ☆18,972 · Jul 29, 2024 · Updated last year
- Welcome to the Llama Cookbook! This is your go to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We als… ☆18,220 · Nov 3, 2025 · Updated 4 months ago
- Making large AI models cheaper, faster and more accessible ☆41,364 · Updated this week
- A curated list of reinforcement learning with human feedback resources (continually updated) ☆4,306 · Dec 9, 2025 · Updated 2 months ago
- Go ahead and axolotl questions ☆11,395 · Updated this week
- Aligning pretrained language models with instruction data generated by themselves. ☆4,580 · Mar 27, 2023 · Updated 2 years ago
- [ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters ☆5,933 · Mar 14, 2024 · Updated last year
- Minimal reproduction of DeepSeek R1-Zero ☆12,853 · Feb 27, 2026 · Updated last week
- LlamaIndex is the leading document agent and OCR platform ☆47,374 · Updated this week
- Simple RL training for reasoning ☆3,830 · Dec 23, 2025 · Updated 2 months ago
- AllenAI's post-training codebase ☆3,605 · Updated this week