huggingface / trl
Train transformer language models with reinforcement learning.
★15,739 · Updated this week
Alternatives and similar repositories for trl
Users interested in trl are comparing it to the libraries listed below.
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ★19,743 · Updated this week
- A framework for few-shot evaluation of language models. ★10,270 · Updated this week
- Fast and memory-efficient exact attention. ★19,778 · Updated last week
- Ongoing research training transformer models at scale. ★13,755 · Updated this week
- Accessible large language models via k-bit quantization for PyTorch. ★7,627 · Updated this week
- An easy-to-use, scalable and high-performance RLHF framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Asy… ★8,058 · Updated 2 weeks ago
- QLoRA: Efficient Finetuning of Quantized LLMs. ★10,680 · Updated last year
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ★9,180 · Updated last week
- verl: Volcano Engine Reinforcement Learning for LLMs. ★13,956 · Updated last week
- A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF). ★4,711 · Updated last year
- Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models". ★12,775 · Updated 9 months ago
- Large Language Model Text Generation Inference. ★10,550 · Updated 3 weeks ago
- PyTorch native post-training library. ★5,523 · Updated this week
- Aligning pretrained language models with instruction data generated by themselves. ★4,487 · Updated 2 years ago
- General technology for enabling AI capabilities with LLMs and MLLMs. ★4,144 · Updated 3 months ago
- Tools for merging pretrained large language models. ★6,337 · Updated 3 weeks ago
- Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4-bit quantization, LoRA and LLaMA-Ad… ★6,076 · Updated 3 months ago
- Example models using DeepSpeed. ★6,686 · Updated last week
- [ICLR 2024] Efficient Streaming Language Models with Attention Sinks. ★7,055 · Updated last year
- Reference implementation for DPO (Direct Preference Optimization). ★2,745 · Updated last year
- SGLang is a fast serving framework for large language models and vision language models. ★18,662 · Updated this week
- Robust recipes to align language models with human and AI preferences. ★5,386 · Updated last month
- LMDeploy is a toolkit for compressing, deploying, and serving LLMs. ★7,134 · Updated last week
- Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We als… ★17,925 · Updated this week
- A curated list of reinforcement learning with human feedback resources (continually updated). ★4,153 · Updated 2 weeks ago
- [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond. ★23,663 · Updated last year
- An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm. ★4,951 · Updated 5 months ago
- 20+ high-performance LLMs with recipes to pretrain, finetune, and deploy at scale. ★12,817 · Updated last week
- RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable)… ★13,997 · Updated last week
- AllenAI's post-training codebase. ★3,222 · Updated this week