A tiny single-file implementation of Group Relative Policy Optimization (GRPO) as introduced by the DeepSeekMath paper
☆42 · Jun 28, 2025 · Updated 11 months ago
Alternatives and similar repositories for microGRPO
Users who are interested in microGRPO are comparing it to the libraries listed below.
- Minimal hackable GRPO implementation ☆341 · Jan 31, 2025 · Updated last year
- Improving Your Model Ranking on Chatbot Arena by Vote Rigging (ICML 2025) ☆27 · Feb 25, 2025 · Updated last year
- ☆14 · May 9, 2024 · Updated 2 years ago
- Is In-Context Learning Sufficient for Instruction Following in LLMs? [ICLR 2025] ☆33 · Jan 23, 2025 · Updated last year
- A first bare bones paralleled implementation of Go Explore as described by the Uber Engineering blog post ☆46 · Jan 25, 2019 · Updated 7 years ago
- exemplar code to download all option chains for a symbol using pyetrade (V1 Etrade API) ☆11 · Sep 28, 2021 · Updated 4 years ago
- ☆14 · Aug 15, 2024 · Updated last year
- ☆12 · Jan 21, 2025 · Updated last year
- A framework for creating your own reinforcement learning environments using pybullet ☆21 · Oct 7, 2019 · Updated 6 years ago
- ADAPTIVE RESONANCE THEORY. Gail A. Carpenter and Stephen Grossberg ☆10 · Feb 10, 2015 · Updated 11 years ago
- URDF description of the JVRC humanoid model ☆15 · Jan 9, 2025 · Updated last year
- Client SDK to automate stock and options trading ☆12 · May 20, 2024 · Updated 2 years ago
- Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment" ☆192 · May 25, 2025 · Updated last year
- A fully modular framework for modeling and optimizing analog neural networks ☆21 · Jan 19, 2026 · Updated 4 months ago
- Sparse Embedding Compression for Scalable Retrieval in Recommender Systems ☆36 · Nov 21, 2025 · Updated 6 months ago
- Kakao Mobility MCP Server for directions and transit information ☆11 · Sep 14, 2025 · Updated 9 months ago
- run deepseek v3 on a single node. Drops unused experts from memory. ☆16 · Jan 26, 2025 · Updated last year
- ☆11 · Mar 23, 2022 · Updated 4 years ago
- Automatic Parallelism Using LLVM ☆10 · Aug 2, 2014 · Updated 11 years ago
- Time-ordered UUIDv4 ☆20 · Jun 10, 2024 · Updated 2 years ago
- ☆24 · Aug 26, 2017 · Updated 8 years ago
- ☆15 · Mar 3, 2025 · Updated last year
- Triangle in Practice! Triton (translated from Korean: 삼각형의 실전! Triton) ☆16 · Feb 15, 2024 · Updated 2 years ago
- OAuth Login for Gradio. Supports multiple identity providers. ☆16 · Jan 20, 2025 · Updated last year
- The LLVM Symbolic Simulator, part of SAW. ☆22 · Jul 17, 2020 · Updated 5 years ago
- EARL: Editing with Autoregression and RL ☆42 · Nov 21, 2025 · Updated 6 months ago
- ☆13 · Aug 4, 2022 · Updated 3 years ago
- ☆25 · Oct 3, 2023 · Updated 2 years ago
- ☆17 · Dec 14, 2022 · Updated 3 years ago
- Code repository for the paper "The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Le… ☆14 · Jan 16, 2025 · Updated last year
- RaccoonWSClient is a lightweight implementation of libwebsockets in C++ ☆11 · Aug 15, 2019 · Updated 6 years ago
- The Pair App is employed by the Agency of Learning for team management and communication. ☆11 · Apr 13, 2024 · Updated 2 years ago
- A quick way to get started with Transformer Lens ☆14 · Dec 13, 2023 · Updated 2 years ago
- An unofficial implementation of SOLAR-10.7B model and the newly proposed interlocked-DUS(iDUS) implementation and experiment details. ☆14 · Mar 20, 2024 · Updated 2 years ago
- ☆28 · Jun 2, 2026 · Updated 2 weeks ago
- Tiny evaluation of leading LLMs on competitive programming problems ☆14 · Apr 10, 2026 · Updated 2 months ago
- ☆18 · Dec 5, 2017 · Updated 8 years ago
- Cloudflare Worker For Session Authentication ☆12 · Feb 4, 2023 · Updated 3 years ago
- 2018εΉ΄ζ₯ε£ε·₯η§εIV-EοΌζΊθ½ε°θ½¦ζΊε¨δΊΊβ10May 10, 2018Updated 8 years ago