huggingface / trl
Train transformer language models with reinforcement learning.
⭐14,193 · Updated this week
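As a one-function illustration of the idea behind trl's tagline (reinforcement learning on a language model's output distribution), here is a NumPy sketch of a single REINFORCE update on the logits of one categorical decision. This is textbook policy-gradient math, not trl's actual API; all names are illustrative:

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D logit vector.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def reinforce_step(logits, action, reward, lr=0.1):
    # REINFORCE: the gradient of reward * log pi(action) w.r.t. the
    # logits is reward * (one_hot(action) - softmax(logits)).
    probs = softmax(logits)
    grad = reward * (np.eye(len(logits))[action] - probs)
    return logits + lr * grad  # gradient ascent on expected reward

# A positive reward makes the sampled action more likely next time.
logits = np.zeros(4)
updated = reinforce_step(logits, action=2, reward=1.0)
```

In RLHF-style training the same gradient is taken per generated token, with the reward supplied by a reward or preference model rather than hard-coded.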
Alternatives and similar repositories for trl
Users interested in trl are comparing it to the libraries listed below.
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ⭐18,774 · Updated last week
- QLoRA: Efficient Finetuning of Quantized LLMs ⭐10,500 · Updated last year
- Fast and memory-efficient exact attention ⭐17,846 · Updated this week
- An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Asy… ⭐7,075 · Updated last week
- Accessible large language models via k-bit quantization for PyTorch. ⭐7,142 · Updated this week
- A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) ⭐4,667 · Updated last year
- Tools for merging pretrained large language models. ⭐5,829 · Updated this week
- A framework for few-shot evaluation of language models. ⭐9,326 · Updated this week
- PyTorch native post-training library ⭐5,273 · Updated this week
- verl: Volcano Engine Reinforcement Learning for LLMs ⭐9,710 · Updated this week
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ⭐12,293 · Updated this week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ⭐8,839 · Updated this week
- Robust recipes to align language models with human and AI preferences ⭐5,223 · Updated last month
- Go ahead and axolotl questions ⭐9,610 · Updated this week
- [ICLR 2024] Efficient Streaming Language Models with Attention Sinks ⭐6,906 · Updated 11 months ago
- Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models" ⭐12,090 · Updated 6 months ago
- Reference implementation for DPO (Direct Preference Optimization) ⭐2,609 · Updated 10 months ago
- [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond. ⭐22,795 · Updated 10 months ago
- Modeling, training, eval, and inference code for OLMo ⭐5,689 · Updated last week
- Large Language Model Text Generation Inference ⭐10,236 · Updated this week
- Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Ad… ⭐6,073 · Updated 9 months ago
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ⭐21,420 · Updated 2 weeks ago
- ⭐4,084 · Updated last year
- Example models using DeepSpeed ⭐6,539 · Updated this week
- Welcome to the Llama Cookbook! This is your go-to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We als… ⭐17,490 · Updated this week
- [ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters ⭐5,882 · Updated last year
- A modular RL library to fine-tune language models to human preferences ⭐2,313 · Updated last year
- Aligning pretrained language models with instruction data generated by themselves. ⭐4,393 · Updated 2 years ago
- Ongoing research training transformer models at scale ⭐12,600 · Updated this week
- Latest Advances on Multimodal Large Language Models ⭐15,578 · Updated this week
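Several entries above center on low-rank adaptation (loralib, 🤗 PEFT, QLoRA). The core LoRA reparameterization, W′ = W + (α/r)·B·A with A of shape r×d_in and B of shape d_out×r, fits in a few lines of NumPy. This is an illustrative sketch of the paper's math, not the loralib API:

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16.0, r=4):
    # Frozen base projection x @ W.T plus the scaled low-rank update
    # delta_W = (alpha / r) * B @ A, applied factor-by-factor so the
    # full d_out x d_in update matrix is never materialized.
    return x @ W.T + (alpha / r) * ((x @ A.T) @ B.T)

# Illustrative shapes: batch n=3, d_in=8, d_out=6, rank r=4.
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8))
W = rng.standard_normal((6, 8))   # frozen pretrained weight
A = rng.standard_normal((4, 8))   # trainable down-projection
B = rng.standard_normal((6, 4))   # trainable up-projection
y = lora_forward(x, W, A, B, r=4)
```

With r much smaller than min(d_in, d_out), A and B add only r·(d_in + d_out) trainable parameters per adapted matrix, which is what makes the method parameter-efficient.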