huggingface / trl
Train transformer language models with reinforcement learning.
★17,297 · Updated this week
Alternatives and similar repositories for trl
Users interested in trl are comparing it to the libraries listed below.
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ★20,587 · Updated this week
- Fast and memory-efficient exact attention ★22,113 · Updated this week
- An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & TIS & vLLM & Ray & Async RL) ★8,949 · Updated this week
- Accessible large language models via k-bit quantization for PyTorch. ★7,939 · Updated 2 weeks ago
- Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models" ★13,233 · Updated last year
- QLoRA: Efficient Finetuning of Quantized LLMs ★10,830 · Updated last year
- A framework for few-shot evaluation of language models. ★11,358 · Updated this week
- Ongoing research training transformer models at scale ★15,100 · Updated last week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ★9,486 · Updated this week
- verl: Volcano Engine Reinforcement Learning for LLMs ★18,963 · Updated this week
- Large Language Model Text Generation Inference ★10,749 · Updated last month
- PyTorch native post-training library ★5,660 · Updated last week
- A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) ★4,741 · Updated 2 years ago
- Robust recipes to align language models with human and AI preferences ★5,489 · Updated 5 months ago
- Tools for merging pretrained large language models. ★6,761 · Updated last week
- Welcome to the Llama Cookbook! This is your go-to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We als… ★18,190 · Updated 3 months ago
- Reference implementation for DPO (Direct Preference Optimization) ★2,846 · Updated last year
- Latest Advances on Multimodal Large Language Models ★17,313 · Updated last week
- A high-throughput and memory-efficient inference and serving engine for LLMs ★69,622 · Updated this week
- General technology for enabling AI capabilities w/ LLMs and MLLMs ★4,277 · Updated last month
- Example models using DeepSpeed ★6,779 · Updated last month
- OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, … ★6,639 · Updated 2 weeks ago
- Go ahead and axolotl questions ★11,251 · Updated this week
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ★22,002 · Updated 2 weeks ago
- ★4,112 · Updated last year
- Modeling, training, eval, and inference code for OLMo ★6,305 · Updated 2 months ago
- Aligning pretrained language models with instruction data generated by themselves. ★4,573 · Updated 2 years ago
- Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Ad… ★6,092 · Updated 7 months ago
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ★13,137 · Updated this week
- Hackable and optimized Transformers building blocks, supporting a composable construction. ★10,326 · Updated this week