A tiny single-file implementation of Group Relative Policy Optimization (GRPO) as introduced by the DeepSeekMath paper
★41 · Jun 28, 2025 · Updated 10 months ago
Alternatives and similar repositories for microGRPO
Users that are interested in microGRPO are comparing it to the libraries listed below.
- Sentence segmentation with wtpsplit's state-of-the-art Segment any Text (SaT) models ★38 · Oct 1, 2025 · Updated 7 months ago
- Aioli: A unified optimization framework for language model data mixing ★32 · Jan 17, 2025 · Updated last year
- Improving Your Model Ranking on Chatbot Arena by Vote Rigging (ICML 2025) ★27 · Feb 25, 2025 · Updated last year
- Atari-style POMDPs ★28 · Apr 24, 2026 · Updated last week
- TD-Regularized Actor-Critic Methods ★36 · Dec 26, 2019 · Updated 6 years ago
- ★17 · Sep 16, 2025 · Updated 7 months ago
- Reinforcement learning training framework for entity-gym environments. ★17 · Mar 18, 2024 · Updated 2 years ago
- ★14 · May 9, 2024 · Updated last year
- One click away from a locally downloaded, fine-tuned model, hosted on Hugging Face, with inference built in. In two hours. ★24 · Nov 9, 2025 · Updated 5 months ago
- A first bare-bones parallelized implementation of Go Explore as described by the Uber Engineering blog post ★46 · Jan 25, 2019 · Updated 7 years ago
- Exemplar code to download all option chains for a symbol using pyetrade (V1 Etrade API) ★11 · Sep 28, 2021 · Updated 4 years ago
- Implementation based on lightrag and nano-graphrag to connect with psql ★15 · Oct 28, 2024 · Updated last year
- ★14 · Aug 15, 2024 · Updated last year
- Highly scalable 2D JAX physics engine. ★65 · Apr 20, 2026 · Updated last week
- ★15 · Jul 9, 2025 · Updated 9 months ago
- ★12 · Jan 21, 2025 · Updated last year
- A framework for creating your own reinforcement learning environments using pybullet ★21 · Oct 7, 2019 · Updated 6 years ago
- Sparse Autoencoders (SAE) vs CLIP fine-tuning fun. ★18 · Dec 19, 2024 · Updated last year
- Client SDK to automate stock and options trading ★12 · May 20, 2024 · Updated last year
- ★33 · Jun 24, 2024 · Updated last year
- ★20 · Dec 16, 2023 · Updated 2 years ago
- A library for training crosscoders ★17 · May 28, 2025 · Updated 11 months ago
- Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment" ★190 · May 25, 2025 · Updated 11 months ago
- Kakao Mobility MCP Server for directions and transit information ★11 · Sep 14, 2025 · Updated 7 months ago
- LLVM 2.9 branch with TI C64x backend. ★11 · Oct 17, 2019 · Updated 6 years ago
- A gymnasium-compatible framework to create reinforcement learning (RL) environment for solving the optimal power flow (OPF) problem. Cont… ★29 · Mar 22, 2025 · Updated last year
- Run DeepSeek V3 on a single node. Drops unused experts from memory. ★16 · Jan 26, 2025 · Updated last year
- ★10 · Aug 27, 2019 · Updated 6 years ago
- Time-ordered UUIDv4 ★20 · Jun 10, 2024 · Updated last year
- Triangles in practice! Triton (original Korean: 삼각형의 실전! Triton) ★16 · Feb 15, 2024 · Updated 2 years ago
- OAuth Login for Gradio. Supports multiple identity providers. ★16 · Jan 20, 2025 · Updated last year
- The LLVM Symbolic Simulator, part of SAW. ★22 · Jul 17, 2020 · Updated 5 years ago
- A simple yet fairly fast Scheme bytecode interpreter written in ANSI C. ★14 · Mar 28, 2021 · Updated 5 years ago
- Large language model of Medical AI, General Medical AI (GMAI) ★17 · Jan 30, 2024 · Updated 2 years ago
- ★43 · Apr 13, 2026 · Updated 2 weeks ago
- ★13 · Aug 4, 2022 · Updated 3 years ago
- [IROS 2025] SIME: Enhancing Policy Self-Improvement with Modal-level Exploration ★16 · Mar 2, 2026 · Updated 2 months ago
- RaccoonWSClient is a lightweight implementation of libwebsockets in C++ ★11 · Aug 15, 2019 · Updated 6 years ago
- A quick way to get started with Transformer Lens ★14 · Dec 13, 2023 · Updated 2 years ago