Train transformer language models with reinforcement learning.
★17,781 · Mar 25, 2026 · Updated this week
Alternatives and similar repositories for trl
Users that are interested in trl are comparing it to the libraries listed below.
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ★20,841 · Mar 18, 2026 · Updated last week
- An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & TIS & vLLM & Ray & Async RL) ★9,231 · Updated this week
- A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) ★4,742 · Jan 8, 2024 · Updated 2 years ago
- verl: Volcano Engine Reinforcement Learning for LLMs ★20,097 · Updated this week
- A modular RL library to fine-tune language models to human preferences ★2,383 · Mar 1, 2024 · Updated 2 years ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ★74,135 · Updated this week
- Fast and memory-efficient exact attention ★22,938 · Updated this week
- Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024) ★69,106 · Updated this week
- Fully open reproduction of DeepSeek-R1 ★25,953 · Nov 24, 2025 · Updated 4 months ago
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ★41,869 · Mar 18, 2026 · Updated last week
- Robust recipes to align language models with human and AI preferences ★5,535 · Sep 8, 2025 · Updated 6 months ago
- Ongoing research training transformer models at scale ★15,744 · Mar 20, 2026 · Updated last week
- A framework for few-shot evaluation of language models. ★11,802 · Mar 18, 2026 · Updated last week
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ★39,445 · Jun 2, 2025 · Updated 9 months ago
- Large Language Model Text Generation Inference ★10,812 · Jan 8, 2026 · Updated 2 months ago
- SGLang is a high-performance serving framework for large language models and multimodal models. ★24,829 · Updated this week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ★9,580 · Updated this week
- Reference implementation for DPO (Direct Preference Optimization) ★2,868 · Aug 11, 2024 · Updated last year
- Code and documentation to train Stanford's Alpaca models, and generate the data. ★30,256 · Jul 17, 2024 · Updated last year
- [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond. ★24,603 · Aug 12, 2024 · Updated last year
- Example models using DeepSpeed ★6,807 · Mar 4, 2026 · Updated 3 weeks ago
- QLoRA: Efficient Finetuning of Quantized LLMs ★10,858 · Jun 10, 2024 · Updated last year
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ★22,059 · Jan 23, 2026 · Updated 2 months ago
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model… ★158,424 · Updated this week
- Accessible large language models via k-bit quantization for PyTorch. ★8,078 · Updated this week
- Unsloth Studio is a web UI for training and running open models like Qwen, DeepSeek, gpt-oss and Gemma locally. ★57,673 · Updated this week
- Inference code for Llama models ★59,250 · Jan 26, 2025 · Updated last year
- A curated list of reinforcement learning with human feedback resources (continually updated) ★4,331 · Dec 9, 2025 · Updated 3 months ago
- Instruct-tune LLaMA on consumer hardware ★18,961 · Jul 29, 2024 · Updated last year
- Tools for merging pretrained large language models. ★6,895 · Mar 15, 2026 · Updated last week
- Simple RL training for reasoning ★3,841 · Dec 23, 2025 · Updated 3 months ago
- Aligning pretrained language models with instruction data generated by themselves. ★4,587 · Mar 27, 2023 · Updated 3 years ago
- Code for the paper Fine-Tuning Language Models from Human Preferences ★1,381 · Jul 25, 2023 · Updated 2 years ago
- Making large AI models cheaper, faster and more accessible ★41,376 · Mar 16, 2026 · Updated last week
- Go ahead and axolotl questions ★11,508 · Updated this week
- Welcome to the Llama Cookbook! This is your go to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We als… ★18,265 · Mar 3, 2026 · Updated 3 weeks ago
- Minimal reproduction of DeepSeek R1-Zero ★12,963 · Feb 27, 2026 · Updated last month
- AllenAI's post-training codebase ★3,643 · Updated this week
- Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.5, DeepSeek-R1, GLM-5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, … ★13,263 · Mar 20, 2026 · Updated last week