huggingface / trl
Train transformer language models with reinforcement learning.
☆16,844 · Updated this week
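trl is built around trainer classes for post-training (SFT, DPO, PPO, GRPO, reward modeling). A minimal supervised fine-tuning sketch under recent trl releases is shown below; the checkpoint and dataset names are illustrative placeholders, and trl, transformers, and datasets are assumed installed.

```python
# Minimal SFT sketch with trl (illustrative names, not a definitive recipe).
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Any conversational or plain-text dataset works; this one is an example.
dataset = load_dataset("trl-lib/Capybara", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",                # any causal LM checkpoint id
    train_dataset=dataset,
    args=SFTConfig(output_dir="sft-output"),  # standard TrainingArguments fields also apply
)
trainer.train()
```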
Alternatives and similar repositories for trl
Users interested in trl are comparing it to the libraries listed below.
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning (see the LoRA sketch after this list). ☆20,382 · Updated 2 weeks ago
- An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & TIS & vLLM & Ray & Dynamic Sampling… ☆8,692 · Updated 2 weeks ago
- A framework for few-shot evaluation of language models. ☆11,069 · Updated last week
- Fast and memory-efficient exact attention ☆21,401 · Updated this week
- Ongoing research training transformer models at scale ☆14,758 · Updated this week
- Accessible large language models via k-bit quantization for PyTorch. ☆7,867 · Updated 3 weeks ago
- verl: Volcano Engine Reinforcement Learning for LLMs ☆17,954 · Updated this week
- QLoRA: Efficient Finetuning of Quantized LLMs ☆10,805 · Updated last year
- PyTorch native post-training library ☆5,639 · Updated this week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ☆9,415 · Updated 2 weeks ago
- Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models" ☆13,106 · Updated last year
- A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) ☆4,738 · Updated last year
- Large Language Model Text Generation Inference ☆10,716 · Updated 2 weeks ago
- Robust recipes to align language models with human and AI preferences ☆5,466 · Updated 3 months ago
- Transformer related optimization, including BERT, GPT ☆6,376 · Updated last year
- Tools for merging pretrained large language models. ☆6,647 · Updated 2 weeks ago
- Reference implementation for DPO (Direct Preference Optimization) ☆2,826 · Updated last year
- RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable)… ☆14,263 · Updated 2 weeks ago
- General technology for enabling AI capabilities w/ LLMs and MLLMs ☆4,238 · Updated 2 weeks ago
- LMDeploy is a toolkit for compressing, deploying, and serving LLMs. ☆7,466 · Updated this week
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens. ☆8,852 · Updated last year
- Aligning pretrained language models with instruction data generated by themselves. ☆4,556 · Updated 2 years ago
- Example models using DeepSpeed ☆6,750 · Updated 2 weeks ago
- A high-throughput and memory-efficient inference and serving engine for LLMs (see the inference sketch after this list) ☆66,734 · Updated this week
- Retrieval and Retrieval-augmented LLMs ☆11,082 · Updated 2 weeks ago
- Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (… ☆11,914 · Updated this week
- AllenAI's post-training codebase ☆3,488 · Updated this week
- A curated list of reinforcement learning with human feedback resources (continually updated) ☆4,249 · Updated 3 weeks ago
- Go ahead and axolotl questions ☆11,024 · Updated this week
- OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, … ☆6,502 · Updated this week
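As referenced in the 🤗 PEFT entry above, a minimal sketch of wrapping a causal LM with LoRA adapters; the base model and hyperparameters are illustrative assumptions, not a prescribed configuration.

```python
# LoRA sketch with PEFT: freeze the base model, train low-rank adapters only.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")  # illustrative base model
lora = LoraConfig(
    r=8,                        # adapter rank
    lora_alpha=16,              # scaling factor
    target_modules=["c_attn"],  # fused attention projection in GPT-2
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

The wrapped model drops into a standard transformers or trl training loop unchanged.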
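And for the vLLM entry, a minimal offline-inference sketch following its batched-generation entry point; the model id is a placeholder and the sampling settings are arbitrary.

```python
# Offline batched generation with vLLM.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # illustrative small checkpoint
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

outputs = llm.generate(["The capital of France is"], params)
print(outputs[0].outputs[0].text)  # first completion for the first prompt
```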