huggingface / trl
Train transformer language models with reinforcement learning.
⭐ 16,844 · Updated this week
Alternatives and similar repositories for trl
Users interested in trl are comparing it to the libraries listed below.
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ⭐ 20,382 · Updated 2 weeks ago
- An easy-to-use, scalable, and high-performance RLHF framework based on Ray (PPO & GRPO & REINFORCE++ & TIS & vLLM & Ray & Dynamic Sampling… ⭐ 8,692 · Updated 2 weeks ago
- Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models" ⭐ 13,122 · Updated last year
- A framework for few-shot evaluation of language models. ⭐ 11,069 · Updated 2 weeks ago
- QLoRA: Efficient Finetuning of Quantized LLMs ⭐ 10,805 · Updated last year
- A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) ⭐ 4,738 · Updated last year
- Fast and memory-efficient exact attention ⭐ 21,401 · Updated this week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ⭐ 9,415 · Updated 2 weeks ago
- Large Language Model Text Generation Inference ⭐ 10,716 · Updated 2 weeks ago
- Accessible large language models via k-bit quantization for PyTorch. ⭐ 7,867 · Updated 3 weeks ago
- Retrieval and Retrieval-augmented LLMs ⭐ 11,082 · Updated 3 weeks ago
- verl: Volcano Engine Reinforcement Learning for LLMs ⭐ 17,954 · Updated this week
- Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, RAG. We als… ⭐ 18,136 · Updated 2 months ago
- PyTorch native post-training library ⭐ 5,639 · Updated last week
- Go ahead and axolotl questions ⭐ 11,024 · Updated this week
- Ongoing research training transformer models at scale ⭐ 14,758 · Updated last week
- Robust recipes to align language models with human and AI preferences ⭐ 5,466 · Updated 3 months ago
- Tools for merging pretrained large language models. ⭐ 6,647 · Updated 2 weeks ago
- Modeling, training, eval, and inference code for OLMo ⭐ 6,266 · Updated last month
- Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4-bit quantization, LoRA and LLaMA-Ad… ⭐ 6,093 · Updated 6 months ago
- Reference implementation for DPO (Direct Preference Optimization) ⭐ 2,826 · Updated last year
- OpenCompass is an LLM evaluation platform supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMA2, Qwen, GLM, Claude, … ⭐ 6,502 · Updated last week
- 20+ high-performance LLMs with recipes to pretrain, finetune, and deploy at scale. ⭐ 13,071 · Updated this week
- AllenAI's post-training codebase ⭐ 3,488 · Updated this week
- Use PEFT or full-parameter training to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (… ⭐ 11,914 · Updated last week
- A modular RL library to fine-tune language models to human preferences ⭐ 2,376 · Updated last year
- Public repo for HF blog posts ⭐ 3,278 · Updated this week
- SGLang is a high-performance serving framework for large language models and multimodal models. ⭐ 22,092 · Updated this week
- Aligning pretrained language models with instruction data generated by themselves. ⭐ 4,563 · Updated 2 years ago
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ⭐ 41,118 · Updated last week