huggingface / trl
Train transformer language models with reinforcement learning.
★ 16,638 · Updated this week
Alternatives and similar repositories for trl
Users interested in trl are comparing it to the libraries listed below.
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ★ 20,268 · Updated this week
- An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & TIS & vLLM & Ray & Dynamic Sampling… ★ 8,586 · Updated this week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ★ 9,377 · Updated this week
- A framework for few-shot evaluation of language models. ★ 10,920 · Updated this week
- A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) ★ 4,730 · Updated last year
- Fast and memory-efficient exact attention ★ 21,067 · Updated this week
- Accessible large language models via k-bit quantization for PyTorch. ★ 7,815 · Updated this week
- Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models" ★ 13,032 · Updated 11 months ago
- QLoRA: Efficient Finetuning of Quantized LLMs ★ 10,785 · Updated last year
- General technology for enabling AI capabilities w/ LLMs and MLLMs ★ 4,226 · Updated this week
- Example models using DeepSpeed ★ 6,747 · Updated 2 months ago
- Ongoing research training transformer models at scale ★ 14,493 · Updated this week
- verl: Volcano Engine Reinforcement Learning for LLMs ★ 17,415 · Updated this week
- Large Language Model Text Generation Inference ★ 10,693 · Updated this week
- Retrieval and Retrieval-augmented LLMs ★ 11,004 · Updated last month
- Reference implementation for DPO (Direct Preference Optimization) ★ 2,805 · Updated last year
- Tools for merging pretrained large language models. ★ 6,575 · Updated this week
- Aligning pretrained language models with instruction data generated by themselves. ★ 4,541 · Updated 2 years ago
- [ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters ★ 5,926 · Updated last year
- Robust recipes to align language models with human and AI preferences ★ 5,447 · Updated 3 months ago
- Welcome to the Llama Cookbook! This is your go-to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We als… ★ 18,081 · Updated last month
- PyTorch native post-training library ★ 5,615 · Updated this week
- RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable)… ★ 14,220 · Updated this week
- AllenAI's post-training codebase ★ 3,417 · Updated this week
- Latest Advances on Multimodal Large Language Models ★ 16,945 · Updated this week
- OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, … ★ 6,413 · Updated this week
- LMDeploy is a toolkit for compressing, deploying, and serving LLMs. ★ 7,377 · Updated this week
- A modular RL library to fine-tune language models to human preferences ★ 2,374 · Updated last year
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ★ 21,875 · Updated 5 months ago
- Simple RL training for reasoning ★ 3,802 · Updated 4 months ago