Train transformer language models with reinforcement learning.
★18,701 Jun 24, 2026 Updated this week
Alternatives and similar repositories for trl
Users that are interested in trl are comparing it to the libraries listed below.
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ★21,299 Updated this week
- An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Asy… ★9,673 Jun 17, 2026 Updated last week
- A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) ★4,749 Jan 8, 2024 Updated 2 years ago
- verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework ★22,070 Updated this week
- A modular RL library to fine-tune language models to human preferences ★2,389 Mar 1, 2024 Updated 2 years ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ★83,677 Updated this week
- Fast and memory-efficient exact attention ★24,221 Updated this week
- Fully open reproduction of DeepSeek-R1 ★26,329 Apr 2, 2026 Updated 2 months ago
- Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024) ★72,482 Updated this week
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ★42,544 Jun 18, 2026 Updated last week
- Robust recipes to align language models with human and AI preferences ★5,614 May 26, 2026 Updated 3 weeks ago
- Ongoing research training transformer models at scale ★16,761 Updated this week
- A framework for few-shot evaluation of language models. ★13,024 Jun 2, 2026 Updated 3 weeks ago
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ★39,486 May 1, 2026 Updated last month
- Large Language Model Text Generation Inference ★10,863 Mar 21, 2026 Updated 3 months ago
- Reference implementation for DPO (Direct Preference Optimization) ★2,887 Aug 11, 2024 Updated last year
- SGLang is a high-performance serving framework for large language models and multimodal models. ★29,460 Updated this week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ★9,737 Updated this week
- Code and documentation to train Stanford's Alpaca models, and generate the data. ★30,249 Jul 17, 2024 Updated last year
- [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond. ★24,881 Aug 12, 2024 Updated last year
- Example models using DeepSpeed ★6,823 May 20, 2026 Updated last month
- QLoRA: Efficient Finetuning of Quantized LLMs ★10,933 Jun 10, 2024 Updated 2 years ago
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ★22,151 Jan 23, 2026 Updated 5 months ago
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model… ★161,885 Updated this week
- Accessible large language models via k-bit quantization for PyTorch. ★8,286 Updated this week
- Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally. ★67,133 Updated this week
- Inference code for Llama models ★59,461 Jan 26, 2025 Updated last year
- A curated list of reinforcement learning with human feedback resources (continually updated) ★4,393 May 20, 2026 Updated last month
- Instruct-tune LLaMA on consumer hardware ★18,913 Jul 29, 2024 Updated last year
- Tools for merging pretrained large language models. ★7,173 Jun 17, 2026 Updated last week
- Simple RL training for reasoning ★3,865 Dec 23, 2025 Updated 6 months ago
- Aligning pretrained language models with instruction data generated by themselves. ★4,598 Mar 27, 2023 Updated 3 years ago
- Code for the paper Fine-Tuning Language Models from Human Preferences ★1,394 Jul 25, 2023 Updated 2 years ago
- Go ahead and axolotl questions ★12,082 Updated this week
- Making large AI models cheaper, faster and more accessible ★41,404 May 25, 2026 Updated last month
- Welcome to the Llama Cookbook! This is your go to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We als… ★18,378 May 19, 2026 Updated last month
- Minimal reproduction of DeepSeek R1-Zero ★13,174 Feb 27, 2026 Updated 3 months ago
- AllenAI's post-training codebase ★3,759 Updated this week
- Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-V4, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL… ★14,561 Jun 18, 2026 Updated last week