Train transformer language models with reinforcement learning.
★18,349 · May 12, 2026 · Updated this week
Alternatives and similar repositories for trl
Users interested in trl are comparing it to the libraries listed below.
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ★21,092 · May 8, 2026 · Updated last week
- An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Asy… ★9,476 · May 7, 2026 · Updated last week
- A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) ★4,746 · Jan 8, 2024 · Updated 2 years ago
- verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework ★21,176 · May 9, 2026 · Updated last week
- A modular RL library to fine-tune language models to human preferences ★2,387 · Mar 1, 2024 · Updated 2 years ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ★79,733 · Updated this week
- Fast and memory-efficient exact attention ★23,736 · Updated this week
- Fully open reproduction of DeepSeek-R1 ★26,018 · Apr 2, 2026 · Updated last month
- Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024) ★71,225 · Updated this week
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ★42,337 · Updated this week
- Robust recipes to align language models with human and AI preferences ★5,597 · Apr 8, 2026 · Updated last month
- Ongoing research training transformer models at scale ★16,340 · Updated this week
- A framework for few-shot evaluation of language models. ★12,490 · May 6, 2026 · Updated last week
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ★39,471 · May 1, 2026 · Updated 2 weeks ago
- Large Language Model Text Generation Inference ★10,853 · Mar 21, 2026 · Updated last month
- Reference implementation for DPO (Direct Preference Optimization) ★2,890 · Aug 11, 2024 · Updated last year
- SGLang is a high-performance serving framework for large language models and multimodal models. ★27,836 · Updated this week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ★9,678 · May 7, 2026 · Updated last week
- Code and documentation to train Stanford's Alpaca models, and generate the data. ★30,253 · Jul 17, 2024 · Updated last year
- [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond. ★24,771 · Aug 12, 2024 · Updated last year
- Example models using DeepSpeed ★6,819 · Mar 30, 2026 · Updated last month
- QLoRA: Efficient Finetuning of Quantized LLMs ★10,908 · Jun 10, 2024 · Updated last year
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ★22,123 · Jan 23, 2026 · Updated 3 months ago
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model… ★160,559 · Updated this week
- Accessible large language models via k-bit quantization for PyTorch. ★8,197 · May 8, 2026 · Updated last week
- Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally. ★63,952 · May 9, 2026 · Updated last week
- Inference code for Llama models ★59,404 · Jan 26, 2025 · Updated last year
- A curated list of reinforcement learning with human feedback resources (continually updated) ★4,363 · Dec 9, 2025 · Updated 5 months ago
- Instruct-tune LLaMA on consumer hardware ★18,931 · Jul 29, 2024 · Updated last year
- Tools for merging pretrained large language models. ★7,069 · May 6, 2026 · Updated last week
- Simple RL training for reasoning ★3,856 · Dec 23, 2025 · Updated 4 months ago
- Aligning pretrained language models with instruction data generated by themselves. ★4,600 · Mar 27, 2023 · Updated 3 years ago
- Code for the paper Fine-Tuning Language Models from Human Preferences ★1,390 · Jul 25, 2023 · Updated 2 years ago
- Go ahead and axolotl questions ★11,890 · May 10, 2026 · Updated last week
- Making large AI models cheaper, faster and more accessible ★41,380 · Updated this week
- Welcome to the Llama Cookbook! This is your go to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We als… ★18,331 · Updated this week
- Minimal reproduction of DeepSeek R1-Zero ★13,099 · Feb 27, 2026 · Updated 2 months ago
- AllenAI's post-training codebase ★3,715 · May 10, 2026 · Updated last week
- Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-R1, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL… ★14,122 · Updated this week