A tiny single-file implementation of Group Relative Policy Optimization (GRPO) as introduced by the DeepSeekMath paper
☆42 · Jun 28, 2025 · Updated 10 months ago
Alternatives and similar repositories for microGRPO
Users that are interested in microGRPO are comparing it to the libraries listed below.
- Recreating the minimal training methods of DeepSeek-R1 for small language models. ☆22 · Feb 10, 2025 · Updated last year
- Aioli: A unified optimization framework for language model data mixing ☆32 · Jan 17, 2025 · Updated last year
- ☆10 · Dec 19, 2019 · Updated 6 years ago
- A Very Simple Vector Database ☆15 · May 1, 2023 · Updated 3 years ago
- We open-source our layout level fast EM simulation tool, EMSim, to the public. ☆15 · Feb 8, 2024 · Updated 2 years ago
- TD-Regularized Actor-Critic Methods ☆36 · Dec 26, 2019 · Updated 6 years ago
- ☆18 · Sep 16, 2025 · Updated 8 months ago
- Create Custom GYM Environment for SUMO and reinforcement learning agent ☆15 · May 5, 2023 · Updated 3 years ago
- A Deep-Reinforcement-Learning-Based Scheduler for FPGA HLS ☆15 · Feb 27, 2021 · Updated 5 years ago
- RDF-to-text generator, using GANs and reinforcement learning. For Google Summer of Code 2020. ☆14 · Mar 25, 2023 · Updated 3 years ago
- Reinforcement learning training framework for entity-gym environments. ☆17 · Mar 18, 2024 · Updated 2 years ago
- ☆14 · May 9, 2024 · Updated 2 years ago
- Is In-Context Learning Sufficient for Instruction Following in LLMs? [ICLR 2025] ☆32 · Jan 23, 2025 · Updated last year
- Implementation of Proximal Policy Optimization (PPO) for continuous action space (`Pendulum-v1` from gym) using tensorflow2.x and pytorch… ☆11 · Aug 8, 2022 · Updated 3 years ago
- FMCW LiDAR implementation in CARLA simulator ☆19 · Mar 18, 2024 · Updated 2 years ago
- A first bare bones paralleled implementation of Go Explore as described by the Uber Engineering blog post ☆46 · Jan 25, 2019 · Updated 7 years ago
- How far can we go with an LLM for a classification problem ☆24 · Nov 24, 2024 · Updated last year
- Creating an environment to quickly train a variety of Deep Reinforcement Learning algorithms on Street Fighter 2 using tournaments betwee… ☆20 · Mar 25, 2023 · Updated 3 years ago
- exemplar code to download all option chains for a symbol using pyetrade (V1 Etrade API) ☆11 · Sep 28, 2021 · Updated 4 years ago
- Implementation based on lightrag and nano-graphrag to connect with psql ☆15 · Oct 28, 2024 · Updated last year
- Code for ThriftyDAgger ☆14 · Dec 29, 2021 · Updated 4 years ago
- ☆14 · Aug 15, 2024 · Updated last year
- Highly scalable 2D JAX physics engine. ☆67 · Apr 20, 2026 · Updated last month
- A simple tutorial to add medical reasoning using GRPO ☆21 · Feb 10, 2025 · Updated last year
- ☆15 · Jul 9, 2025 · Updated 10 months ago
- Postgres protocol support for finagle ☆36 · Sep 4, 2013 · Updated 12 years ago
- SpatialThinker: Reinforcing 3D Reasoning in Multimodal LLMs via Spatial Rewards ☆38 · Jan 28, 2026 · Updated 3 months ago
- My Python Intel 4004 Emulator ☆19 · Jan 29, 2016 · Updated 10 years ago
- Sparse Autoencoders (SAE) vs CLIP fine-tuning fun. ☆18 · Dec 19, 2024 · Updated last year
- Client SDK to automate stock and options trading ☆12 · May 20, 2024 · Updated 2 years ago
- ☆33 · Jun 24, 2024 · Updated last year
- A library for training crosscoders ☆17 · May 28, 2025 · Updated 11 months ago
- Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment" ☆190 · May 25, 2025 · Updated 11 months ago
- The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud. ☆17 · Jun 5, 2024 · Updated last year
- A fully modular framework for modeling and optimizing analog neural networks ☆21 · Jan 19, 2026 · Updated 4 months ago
- Sparse Embedding Compression for Scalable Retrieval in Recommender Systems ☆35 · Nov 21, 2025 · Updated 6 months ago
- A gymnasium-compatible framework to create reinforcement learning (RL) environment for solving the optimal power flow (OPF) problem. Cont… ☆29 · Mar 22, 2025 · Updated last year
- Sketch and LSH Index library for Java, including OPH methods as well as the Lazo method ☆15 · Dec 24, 2023 · Updated 2 years ago
- ☆11 · Mar 23, 2022 · Updated 4 years ago