ALucek/GRPO-Training

An overview of GRPO & DeepSeek-R1 Training with Open Source GRPO Model Fine Tuning

☆38

Alternatives and similar repositories for GRPO-Training

Users who are interested in GRPO-Training are comparing it to the repositories listed below.

Jaykef / Triton-nanoGPT
Custom triton kernels for training Karpathy's nanoGPT.
☆19 · Oct 21, 2024 · Updated last year

kaueltzen / LLM_Hackathon_2024
This repo contains code and data of our contribution to the 2024 LLM Hackathon, materials' property prediction from textual descriptions …
☆12 · May 9, 2024 · Updated 2 years ago

eloimoliner / unconditional-diff-STFT
Unconditional music synthesis using a diffusion model in the STFT domain
☆12 · May 31, 2022 · Updated 4 years ago

evintunador / gpt-lab
cheap & easy LLM experiments for amateurs (alpha)
☆25 · Nov 30, 2025 · Updated 7 months ago

codingthefuturewithai / yt-mastering-ai-coding-ai-journaling-app
App built in the "Coding the Future With AI" YouTube tutorial series "Mastering AI Coding"
☆12 · Jan 5, 2025 · Updated last year

jon-chun / GenAI-Multi-Agent-Networks-and-Digital-Twins
Generative AI, Multi-Agent Systems (MAS), AI Research Methodology, Industry Best Practices, and The Future of Work (Kenyon College's Inte…
☆23 · Dec 22, 2025 · Updated 7 months ago

ALucek / swarm-meal-planner
Exploring and demonstrating OpenAI's Swarm framework
☆20 · Oct 20, 2024 · Updated last year

zaydzuhri / token-order-prediction
Landing repository for the paper "Predicting the Order of Upcoming Tokens Improves Language Modeling"
☆48 · May 13, 2026 · Updated 2 months ago

davanstrien / data-for-fine-tuning-llms
☆80 · Jun 5, 2024 · Updated 2 years ago

PromtEngineer / RAG_with_DeepSeek_R1
☆20 · Jun 28, 2025 · Updated last year

ekramasif / GeminiCoder
Instantly convert ideas into app code with AI! This React app uses the Gemini API to generate and preview code from Markdown, making prot…
☆15 · Jun 21, 2026 · Updated last month

khanhvy31 / Rag-with-your-csv
☆17 · Feb 22, 2025 · Updated last year

ThinamXx / cuda-mode
Making of cuda kernel
☆17 · May 27, 2025 · Updated last year

ALucek / LLM-distillation-guide
☆30 · Aug 5, 2024 · Updated last year

kyegomez / MobileVLM
Implementation of the LDP module block in PyTorch and Zeta from the paper: "MobileVLM: A Fast, Strong and Open Vision Language Assistant …
☆15 · Mar 11, 2024 · Updated 2 years ago

tejasmagia / DetectCarParkingSlot_Contest
Detecting car parking slot on Open car park space
☆13 · Oct 21, 2019 · Updated 6 years ago

soheil-mp / Reinforcement-Learning-Algorithms
Step by Step Reinforcement Learning Tutorials.
☆12 · Nov 19, 2022 · Updated 3 years ago

fionn / feynman
Calculate allowed interactions in QED
☆10 · Nov 2, 2022 · Updated 3 years ago

wenlai-lavine / jola
Code for ICML 2025 paper | Joint Localization and Activation Editing for Low-Resource Fine-Tuning
☆28 · Jun 18, 2025 · Updated last year

lsaint / python-docx-oss
CRUD Word documents with Python
☆13 · Feb 5, 2026 · Updated 5 months ago

tokenbender / infinite
a rubric driven prioritized replay rl algo to maximise continual learning
☆16 · Oct 12, 2025 · Updated 9 months ago

lucidrains / infini-transformer-pytorch
Implementation of Infini-Transformer in Pytorch
☆112 · Jan 4, 2025 · Updated last year

penfever / TuneTables
TuneTables is a tabular classifier that implements prompt tuning for frozen prior-fitted networks.
☆24 · Mar 31, 2025 · Updated last year

canyuchen / ClinicalBench
Code for the KDD'26 paper "ClinicalBench: Can LLMs Beat Traditional ML Models in Clinical Prediction?"
☆36 · Jun 29, 2026 · Updated 3 weeks ago

yhjo09 / videogan-pytorch
☆10 · Apr 9, 2019 · Updated 7 years ago

AguirreNicolas / PPG2IABP
☆15 · Apr 24, 2022 · Updated 4 years ago

cheng-zhao / libast
C library for evaluating expressions with the abstract syntax tree.
☆15 · Aug 3, 2020 · Updated 5 years ago

zhenheny / rPPG_antispoofing
Face anti-spoofing using rPPG
☆14 · Sep 28, 2017 · Updated 8 years ago

codingthefuturewithai / rag-retriever
A Python application that loads and processes both web pages and local documents, indexing their content using embeddings, and enabling s…
☆27 · Jul 20, 2025 · Updated last year

kyegomez / SimpleMamba
Implementation of a modular, high-performance, and simplistic mamba for high-speed applications
☆41 · Nov 11, 2024 · Updated last year

Joseph-Ellaway / Ramachandran_Plotter
Program to plot a Ramachandran plot of all dihedral angles from a given PDB file. Background is empirically generated from the peptides …
☆13 · Feb 25, 2025 · Updated last year

kingrc15 / multimodal-clinical-pretraining
This is the official code for "Multimodal Pretraining of Medical Time Series and Notes" at Machine Learning for Health 2023
☆21 · Jan 6, 2025 · Updated last year

Laz4rz / mup
Minimal (truly) muP implementation, consistent with TP4 and TP5 papers notation
☆14 · Jan 2, 2026 · Updated 6 months ago

flowerinthenight / luna
In-memory OLAP SQL server for object storage data.
☆14 · Oct 15, 2025 · Updated 9 months ago

krishgoel / chronocept-baseline-models
The official baseline implementations for Chronocept
☆10 · Mar 31, 2026 · Updated 3 months ago

THUKElab / CCL2023-CLTC-THU_KELab
This repository open-sources our GEC system submitted by THU KELab (sz) in the CCL2023-CLTC Track 1: Multidimensional Chinese Learner Tex…
☆15 · Nov 25, 2023 · Updated 2 years ago

alejandro-ao / mcp-client-python
☆69 · Apr 27, 2025 · Updated last year

vgtomahawk / Charmanteau-CamReady
Code for "CharManteau: Character Embedding Models For Portmanteau Creation. EMNLP 2017. Varun Gangal*, Harsh Jhamtani*, Graham Neubig, Ed…
☆10 · Jun 20, 2019 · Updated 7 years ago

pierrel55 / llama_st
Load and run Llama from safetensors files in C
☆15 · Oct 24, 2024 · Updated last year