superlinear-ai/microGRPO

🐭 A tiny single-file implementation of Group Relative Policy Optimization (GRPO) as introduced by the DeepSeekMath paper

☆43

Alternatives and similar repositories for microGRPO

Users who are interested in microGRPO are comparing it to the libraries listed below.

jeffasante / grpo-maze-solver
A reinforcement learning agent that learns to solve mazes using Group Relative Policy Optimization (GRPO).
☆12 · Feb 9, 2025 · Updated last year
sdiehl / tiny-r1
Recreating the minimal training methods of DeepSeek-R1 for small language models.
☆22 · Feb 10, 2025 · Updated last year
Vlsir / Hdl21Schematics
Hdl21 Schematics
☆17 · Jan 24, 2024 · Updated 2 years ago
Masoudjafaripour / nanochat-VLM
A minimal, hackable Vision-Language Model built on Karpathy’s nanochat — add image understanding and multimodal chat for under $200 in co…
☆24 · Jul 14, 2026 · Updated last week
Gitnoter / Gitnoter
A `Markdown` note-taking app that stores notes in a `Git` repository
☆22 · Nov 28, 2019 · Updated 6 years ago
open-thought / tiny-grpo
Minimal hackable GRPO implementation
☆344 · Jan 31, 2025 · Updated last year
YanzhaoShi / HSENet
The official code and model of HSENet: Hybrid Spatial Encoding Network for 3D Medical Vision-Language Understanding.
☆15 · Sep 19, 2025 · Updated 10 months ago
somilagg / MIMO-Radar-Reinforcement-Learning-Environment
☆10 · Dec 19, 2019 · Updated 6 years ago
HazyResearch / aioli
Aioli: A unified optimization framework for language model data mixing
☆33 · Jan 17, 2025 · Updated last year
sparisi / td-reg
TD-Regularized Actor-Critic Methods
☆37 · Dec 26, 2019 · Updated 6 years ago
huiwy / reflection-on-trees
☆14 · May 9, 2024 · Updated 2 years ago
dbpedia / RDF2text-GAN
RDF-to-text generator using GANs and reinforcement learning. For Google Summer of Code 2020.
☆14 · Mar 25, 2023 · Updated 3 years ago
entity-neural-network / enn-trainer
Reinforcement learning training framework for entity-gym environments.
☆17 · Mar 18, 2024 · Updated 2 years ago
HarleyCoops / OneShotAquaRAT
One click away from a locally downloaded, fine-tuned model, hosted on Hugging Face, with inference built in. In two hours.
☆24 · Updated this week
tml-epfl / icl-alignment
Is In-Context Learning Sufficient for Instruction Following in LLMs? [ICLR 2025]
☆33 · Jan 23, 2025 · Updated last year
grubermeister / i4004
My Python Intel 4004 Emulator
☆19 · Jan 29, 2016 · Updated 10 years ago
Dahoas / QDSyntheticData
☆14 · Aug 15, 2024 · Updated last year
1rocketdude / pyetrade_option_chains
Example code to download all option chains for a symbol using pyetrade (V1 Etrade API)
☆11 · Sep 28, 2021 · Updated 4 years ago
hooman650 / MedQwenReasoner
A simple tutorial to add medical reasoning using GRPO
☆21 · Feb 10, 2025 · Updated last year
chang-github-00 / LLM-Predictive-Decoding
☆16 · Jul 9, 2025 · Updated last year
jimysancho / graphrag-psql
Implementation based on lightrag and nano-graphrag to connect with psql
☆15 · Oct 28, 2024 · Updated last year
sail-sg / Rigging-ChatbotArena
Improving Your Model Ranking on Chatbot Arena by Vote Rigging (ICML 2025)
☆27 · Feb 25, 2025 · Updated last year
rcrowder / AdaptiveResonanceTheory
Adaptive Resonance Theory. Gail A. Carpenter and Stephen Grossberg
☆10 · Feb 10, 2015 · Updated 11 years ago
ryanhoque / thriftydagger
Code for ThriftyDAgger
☆15 · Dec 29, 2021 · Updated 4 years ago
qpwo / dsv3-lowmem
Run DeepSeek V3 on a single node. Drops unused experts from memory.
☆16 · Jan 26, 2025 · Updated last year
Vivswan / AnalogVNN
A fully modular framework for modeling and optimizing analog neural networks
☆21 · Jan 19, 2026 · Updated 6 months ago
zer0int / CLIP-SAE-finetune
Sparse Autoencoders (SAE) vs CLIP fine-tuning fun.
☆18 · Dec 19, 2024 · Updated last year
McGill-NLP / VinePPO
Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment"
☆192 · May 25, 2025 · Updated last year
alexjordan / LLVM-TMS320C64X
LLVM 2.9 branch with TI C64x backend.
☆11 · Oct 17, 2019 · Updated 6 years ago
lava-security-research / forge-framework
Top 10 Data Centers & AI Infrastructure Security Risks
☆16 · Updated this week
ati-ozgur / RmSAT-CFAR
☆24 · Aug 26, 2017 · Updated 8 years ago
tony23545 / DeepKoopman
Use deep learning to learn the Koopman operator and LQR for optimal control
☆18 · Sep 28, 2020 · Updated 5 years ago
CharlesCNorton / intel-4004-verified
Formalizing the Intel 4004 microprocessor
☆25 · Jul 14, 2026 · Updated last week
GaloisInc / llvm-verifier
The LLVM Symbolic Simulator, part of SAW.
☆22 · Jul 17, 2020 · Updated 6 years ago
SimonsRoad / DynaSLAMReview
☆10 · Aug 27, 2019 · Updated 6 years ago
joleeson / JRC-AoI-multi
Code for the paper "Learning to Schedule Joint Radar-Communication with Deep Multi-Agent Reinforcement Learning" as published in the IEEE…
☆27 · Aug 10, 2022 · Updated 3 years ago
seanzw / gem5-avx
Adds partial support for AVX2 and AVX-512 to gem5.
☆15 · Dec 19, 2023 · Updated 2 years ago
JeanKaddour / WASAM
Weight-Averaged Sharpness-Aware Minimization (NeurIPS 2022)
☆28 · Jan 13, 2023 · Updated 3 years ago
baidu / speech-samples
Baidu speech samples
☆50 · Feb 28, 2018 · Updated 8 years ago