willccbb / trl
Train transformer language models with reinforcement learning.
☆18 Updated last month
Alternatives and similar repositories for trl:
Users who are interested in trl are comparing it to the libraries listed below:
- ☆21 Updated 4 months ago
- An overview of GRPO & DeepSeek-R1 Training with Open Source GRPO Model Fine Tuning ☆31 Updated last month
- A pure MLX-based training pipeline for fine-tuning LLMs using GRPO on Apple Silicon. ☆32 Updated last month
- Simple examples using Argilla tools to build AI ☆53 Updated 4 months ago
- MCP Server to run python code locally ☆49 Updated 3 months ago
- ☆19 Updated last week
- ☆16 Updated 5 months ago
- ☆50 Updated 4 months ago
- ☆61 Updated last month
- ☆31 Updated 2 months ago
- ☆36 Updated last month
- Build a Recommendation System Agent using LATS Agent Approach ☆28 Updated last month
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆56 Updated 2 weeks ago
- Example implementation of Iteration of Thought - Give a star if you like the project ☆39 Updated 3 months ago
- Conduct in-depth research with AI-driven insights: DeepDive is a command-line tool that leverages web searches and AI models to generate… ☆39 Updated 7 months ago
- ☆29 Updated last year
- Official code of the paper "SimGRAG: Leveraging Similar Subgraphs for Knowledge Graphs Driven Retrieval-Augmented Generation" ☆105 Updated 3 months ago
- ☆35 Updated last week
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆91 Updated 2 months ago
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆105 Updated 3 months ago
- Embed anything. ☆29 Updated 10 months ago
- Official Repo for The Paper "Talk Structurally, Act Hierarchically: A Collaborative Framework for LLM Multi-Agent Systems" ☆47 Updated last month
- ☆61 Updated 5 months ago
- Framework for building, orchestrating and deploying multi-agent systems. Managed by OpenAI Solutions team. Experimental framework. ☆90 Updated 5 months ago
- ☆86 Updated last month
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte… ☆64 Updated 5 months ago
- ☆78 Updated last week
- Uses a Gradio interface to stream coding-related responses from local and cloud-based large language models. Pulls context from GitHub Re… ☆20 Updated 2 weeks ago
- This codebase demonstrates various DSPy functionalities through practical examples. ☆36 Updated last month
- ☆85 Updated 2 months ago