🐭 A tiny single-file implementation of Group Relative Policy Optimization (GRPO) as introduced by the DeepSeekMath paper
☆39Jun 28, 2025Updated 8 months ago
Alternatives and similar repositories for microGRPO
Users who are interested in microGRPO are comparing it to the libraries listed below.
- A reinforcement learning agent that learns to solve mazes using Group Relative Policy Optimization (GRPO).☆12Feb 9, 2025Updated last year
- Aioli: A unified optimization framework for language model data mixing☆32Jan 17, 2025Updated last year
- CFR implementation of a poker bot.☆12Feb 17, 2023Updated 3 years ago
- Is In-Context Learning Sufficient for Instruction Following in LLMs? [ICLR 2025]☆32Jan 23, 2025Updated last year
- A collection of different PyTorch wrappers for training neural networks and reinforcement algorithms☆13Dec 15, 2022Updated 3 years ago
- Tutorial: Writing R and Python Packages with Multithreaded C++ Code using BLAS, AVX2/AVX512, OpenMP, C++11 Threads and Cuda GPU accelerat…☆13Nov 27, 2022Updated 3 years ago
- Minimal hackable GRPO implementation☆327Jan 31, 2025Updated last year
- ☆33Jun 24, 2024Updated last year
- Some microbenchmarks and design docs before commencement☆12Feb 1, 2021Updated 5 years ago
- exemplar code to download all option chains for a symbol using pyetrade (V1 Etrade API)☆10Sep 28, 2021Updated 4 years ago
- code for the ddp tutorial☆32Apr 9, 2022Updated 3 years ago
- ADAPTIVE RESONANCE THEORY. Gail A. Carpenter and Stephen Grossberg☆10Feb 10, 2015Updated 11 years ago
- The Pair App is employed by the Agency of Learning for team management and communication.☆10Apr 13, 2024Updated last year
- 🛠️BullMQ-inspired Rust library for advanced job & queue management with Redis. Supports job prioritization, delays, retries, workers, lo…☆11Feb 14, 2025Updated last year
- A QA system based on k8s-specific knowledge, built on ChatGLM2-6B and served by Ray.☆10Sep 14, 2023Updated 2 years ago
- A Maze Game Using HTML5 Canvas☆11Nov 30, 2015Updated 10 years ago
- 🌿 Quickly generate a folder directory tree, with configurable directory depth and optional output to a markdown file.☆13Oct 19, 2022Updated 3 years ago
- LLVM 2.9 branch with TI C64x backend.☆11Oct 17, 2019Updated 6 years ago
- A Texas Holdem poker framework written in C++ 20.☆11Apr 23, 2023Updated 2 years ago
- Debiasing Through Data Attribution☆12May 23, 2024Updated last year
- PPH in C☆24Nov 21, 2025Updated 3 months ago
- 🚀 Train your own VLA "large model" end to end: a 26M-parameter vision multimodal VLM trained from scratch in just 1 hour!☆27Oct 16, 2025Updated 4 months ago
- Exploration of automated dataset selection approaches at large scales.☆52Mar 4, 2025Updated 11 months ago
- ☆13Jul 27, 2024Updated last year
- ☆11Nov 27, 2018Updated 7 years ago
- CFR-based Texas Hold'em AI☆11Jan 30, 2021Updated 5 years ago
- Gym wrapper for pysc2☆10Sep 16, 2022Updated 3 years ago
- Tool for migrating MongoDB contents to Solr for indexing written in Ruby☆17Aug 24, 2011Updated 14 years ago
- Agentic Keyframe Search for Video Question Answering☆16Apr 7, 2025Updated 10 months ago
- Swarm learning algorithm☆11Jun 2, 2021Updated 4 years ago
- nd009-cn-advanced-p5, for the Udacity CN MLND P5 project☆14Jun 27, 2022Updated 3 years ago
- An implementation of the AlphaZero algorithm for adversarial games to be used with the machine learning framework of your choice☆12Aug 30, 2020Updated 5 years ago
- MLflow App Using React, Hooks, RabbitMQ, FastAPI Server, Celery, Microservices☆11Sep 25, 2022Updated 3 years ago
- Example code for the NNGeometry PyTorch library☆10Aug 20, 2025Updated 6 months ago
- Graphical user interface for text-guided face editing☆11Jan 18, 2023Updated 3 years ago
- The Conceptual Coverage Across Languages Benchmark for Text-to-Image Models☆12Oct 28, 2024Updated last year
- Poker hand evaluation for Go☆12Feb 7, 2014Updated 12 years ago
- Tiny evaluation of leading LLMs on competitive programming problems☆14Nov 28, 2024Updated last year
- Lipschitz Lifelong RL☆11Nov 6, 2020Updated 5 years ago