ALucek / GRPO-Training
An overview of GRPO & DeepSeek-R1 Training with Open Source GRPO Model Fine Tuning
☆32 Updated last month
Alternatives and similar repositories for GRPO-Training
Users who are interested in GRPO-Training are comparing it to the repositories listed below.
- An LLM reads a paper and produces a working prototype ☆57 Updated 2 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 Updated 11 months ago
- Simple GRPO scripts and configurations. ☆58 Updated 4 months ago
- Build a Recommendation System Agent using LATS Agent Approach ☆30 Updated 3 months ago
- rl from zero pretrain, can it be done? we'll see. ☆56 Updated this week
- Source code of "How to Correctly do Semantic Backpropagation on Language-based Agentic Systems" 🤖 ☆70 Updated 6 months ago
- ☆51 Updated 7 months ago
- A reasoning assistant for your STEM education ☆19 Updated 3 months ago
- ☆92 Updated 3 months ago
- Agentic RAG to help you build a startup🚀 ☆44 Updated 2 months ago
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback. ☆30 Updated 2 months ago
- Simple examples using Argilla tools to build AI ☆53 Updated 7 months ago
- ☆54 Updated 4 months ago
- Set of scripts to finetune LLMs ☆37 Updated last year
- Train your own SOTA deductive reasoning model ☆94 Updated 3 months ago
- Lean implementation of various multi-agent LLM methods, including Iteration of Thought (IoT) ☆115 Updated 4 months ago
- ☆46 Updated 2 months ago
- Large Language Model (LLM) powered evaluator for Retrieval Augmented Generation (RAG) pipelines. ☆28 Updated last year
- ☆69 Updated 4 months ago
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera… ☆71 Updated this week
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆53 Updated 4 months ago
- ☆29 Updated last year
- An agent to generate stunning images :) ☆19 Updated last month
- A Qwen 0.5B reasoning model trained on OpenR1-Math-220k ☆14 Updated 4 months ago
- ☆50 Updated 3 weeks ago
- ☆63 Updated last month
- AI agent with RAG+ReAct on Indian Constitution & BNS ☆66 Updated 8 months ago
- Framework for building, orchestrating and deploying multi-agent systems. Managed by OpenAI Solutions team. Experimental framework. ☆92 Updated 8 months ago
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte… ☆71 Updated 7 months ago
- Tutorial on how to create a ReAct agent without an LLM framework ☆57 Updated 11 months ago