huggingface / trl
Train transformer language models with reinforcement learning.
★ 16,012 · Updated this week
Alternatives and similar repositories for trl
Users interested in trl are comparing it to the libraries listed below:
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ★ 19,900 · Updated this week
- An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & As… ★ 8,228 · Updated last week
- A framework for few-shot evaluation of language models. ★ 10,433 · Updated this week
- verl: Volcano Engine Reinforcement Learning for LLMs ★ 14,648 · Updated this week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ★ 9,231 · Updated last week
- QLoRA: Efficient Finetuning of Quantized LLMs ★ 10,710 · Updated last year
- A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) ★ 4,716 · Updated last year
- Accessible large language models via k-bit quantization for PyTorch. ★ 7,687 · Updated last week
- Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models" ★ 12,843 · Updated 10 months ago
- Fast and memory-efficient exact attention ★ 20,151 · Updated this week
- PyTorch native post-training library ★ 5,547 · Updated last week
- Tools for merging pretrained large language models. ★ 6,394 · Updated last month
- Ongoing research training transformer models at scale ★ 13,976 · Updated this week
- Robust recipes to align language models with human and AI preferences ★ 5,406 · Updated last month
- Large Language Model Text Generation Inference ★ 10,580 · Updated last month
- Reference implementation for DPO (Direct Preference Optimization) ★ 2,760 · Updated last year
- Example models using DeepSpeed ★ 6,701 · Updated 2 weeks ago
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens. ★ 8,775 · Updated last year
- Aligning pretrained language models with instruction data generated by themselves. ★ 4,507 · Updated 2 years ago
- Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Ad… ★ 6,079 · Updated 3 months ago
- A curated list of reinforcement learning with human feedback resources (continually updated) ★ 4,184 · Updated last month
- A high-throughput and memory-efficient inference and serving engine for LLMs ★ 60,980 · Updated this week
- Latest Advances on Multimodal Large Language Models ★ 16,526 · Updated last week
- Welcome to the Llama Cookbook! This is your go-to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We als… ★ 17,979 · Updated this week
- General technology for enabling AI capabilities w/ LLMs and MLLMs ★ 4,157 · Updated 3 months ago
- Retrieval and Retrieval-augmented LLMs ★ 10,725 · Updated last week
- AllenAI's post-training codebase ★ 3,263 · Updated this week
- OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMA2, Qwen, GLM, Claude, … ★ 6,204 · Updated this week
- SGLang is a fast serving framework for large language models and vision language models. ★ 19,094 · Updated last week
- An easy-to-use LLMs quantization package with user-friendly APIs, based on the GPTQ algorithm. ★ 4,970 · Updated 6 months ago