Train transformer language models with reinforcement learning.
★ 18,134 · Updated this week (Apr 22, 2026)
Alternatives and similar repositories for trl
Users interested in trl are comparing it to the libraries listed below.
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ★ 20,981 · Updated this week
- An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Asy… ★ 9,381 · Updated last week (Apr 19, 2026)
- A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) ★ 4,745 · Updated 2 years ago (Jan 8, 2024)
- verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework ★ 20,930 · Updated this week
- A modular RL library to fine-tune language models to human preferences ★ 2,387 · Updated 2 years ago (Mar 1, 2024)
- A high-throughput and memory-efficient inference and serving engine for LLMs ★ 77,531 · Updated this week
- Fast and memory-efficient exact attention ★ 23,457 · Updated this week
- Fully open reproduction of DeepSeek-R1 ★ 26,004 · Updated 3 weeks ago (Apr 2, 2026)
- Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024) ★ 70,504 · Updated this week
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ★ 42,188 · Updated this week
- Robust recipes to align language models with human and AI preferences ★ 5,572 · Updated 2 weeks ago (Apr 8, 2026)
- Ongoing research training transformer models at scale ★ 16,145 · Updated this week
- A framework for few-shot evaluation of language models. ★ 12,331 · Updated this week
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ★ 39,451 · Updated 10 months ago (Jun 2, 2025)
- Large Language Model Text Generation Inference ★ 10,843 · Updated last month (Mar 21, 2026)
- SGLang is a high-performance serving framework for large language models and multimodal models. ★ 26,397 · Updated this week
- Reference implementation for DPO (Direct Preference Optimization) ★ 2,884 · Updated last year (Aug 11, 2024)
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ★ 9,621 · Updated last week (Apr 17, 2026)
- Code and documentation to train Stanford's Alpaca models, and generate the data. ★ 30,259 · Updated last year (Jul 17, 2024)
- [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond. ★ 24,707 · Updated last year (Aug 12, 2024)
- Example models using DeepSpeed ★ 6,819 · Updated 3 weeks ago (Mar 30, 2026)
- QLoRA: Efficient Finetuning of Quantized LLMs ★ 10,892 · Updated last year (Jun 10, 2024)
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ★ 22,107 · Updated 3 months ago (Jan 23, 2026)
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model… ★ 159,742 · Updated this week
- Accessible large language models via k-bit quantization for PyTorch. ★ 8,149 · Updated this week
- Web UI for training and running open models like Gemma 4, Qwen3.5, DeepSeek, gpt-oss locally. ★ 62,269 · Updated this week
- Inference code for Llama models ★ 59,370 · Updated last year (Jan 26, 2025)
- A curated list of reinforcement learning with human feedback resources (continually updated) ★ 4,352 · Updated 4 months ago (Dec 9, 2025)
- Instruct-tune LLaMA on consumer hardware ★ 18,945 · Updated last year (Jul 29, 2024)
- Tools for merging pretrained large language models. ★ 7,023 · Updated last month (Mar 15, 2026)
- Simple RL training for reasoning ★ 3,849 · Updated 4 months ago (Dec 23, 2025)
- Aligning pretrained language models with instruction data generated by themselves. ★ 4,591 · Updated 3 years ago (Mar 27, 2023)
- Code for the paper Fine-Tuning Language Models from Human Preferences ★ 1,387 · Updated 2 years ago (Jul 25, 2023)
- Go ahead and axolotl questions ★ 11,737 · Updated this week
- Making large AI models cheaper, faster and more accessible ★ 41,375 · Updated last week (Apr 13, 2026)
- Welcome to the Llama Cookbook! This is your go to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We als… ★ 18,300 · Updated this week
- Minimal reproduction of DeepSeek R1-Zero ★ 13,070 · Updated 2 months ago (Feb 27, 2026)
- AllenAI's post-training codebase ★ 3,702 · Updated this week
- Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-R1, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL… ★ 13,898 · Updated this week