willccbb / trl
Train transformer language models with reinforcement learning.
☆19Updated 3 months ago
Alternatives and similar repositories for trl
Users who are interested in trl are comparing it to the libraries listed below.
- ☆59Updated 2 weeks ago
- ☆50Updated this week
- Source code of "How to Correctly do Semantic Backpropagation on Language-based Agentic Systems" 🤖☆68Updated 6 months ago
- LLM reads a paper and produces a working prototype☆57Updated last month
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning?☆67Updated 2 months ago
- Simple examples using Argilla tools to build AI☆53Updated 6 months ago
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback.☆24Updated 2 months ago
- An overview of GRPO & DeepSeek-R1 Training with Open Source GRPO Model Fine Tuning☆32Updated 2 weeks ago
- accompanying material for sleep-time compute paper☆90Updated last month
- ☆67Updated 3 months ago
- ☆36Updated 4 months ago
- A framework for pitting LLMs against each other in an evolving library of games ⚔☆31Updated last month
- ☆29Updated last year
- Build a Recommendation System Agent using LATS Agent Approach☆30Updated 3 months ago
- ☆46Updated this week
- A collection of example AI programs built using DSPy and maintained by the Langtrace AI team.☆28Updated 6 months ago
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo)☆78Updated 2 months ago
- ☆21Updated 7 months ago
- ☆16Updated 7 months ago
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera…☆60Updated last week
- ☆92Updated 2 months ago
- Synthetic data generation and benchmark implementation for "Episodic Memories Generation and Evaluation Benchmark for Large Language Mode…☆45Updated last month
- Automatic Prompt Optimization☆36Updated last year
- Source code for the collaborative reasoner research project at Meta FAIR.☆87Updated last month
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna☆53Updated 4 months ago
- Lean implementation of various multi-agent LLM methods, including Iteration of Thought (IoT)☆113Updated 3 months ago
- Train your own SOTA deductive reasoning model☆92Updated 3 months ago
- Large Language Model (LLM) powered evaluator for Retrieval Augmented Generation (RAG) pipelines.☆27Updated last year
- ☆40Updated last month
- One Line To Build Zero-Data Classifiers in Minutes☆55Updated 8 months ago