A reinforcement learning agent that learns to solve mazes using Group Relative Policy Optimization (GRPO).
☆12Feb 9, 2025Updated last year
Alternatives and similar repositories for grpo-maze-solver
Users who are interested in grpo-maze-solver are comparing it to the repositories listed below.
- 🐭 A tiny single-file implementation of Group Relative Policy Optimization (GRPO) as introduced by the DeepSeekMath paper☆41Jun 28, 2025Updated 9 months ago
- Implementation of PPO for CartPole-v1☆10Jan 1, 2019Updated 7 years ago
- Hdl21 Schematics☆17Jan 24, 2024Updated 2 years ago
- A `Markdown` note-taking app backed by `Git` repository storage☆22Nov 28, 2019Updated 6 years ago
- "Snake's Food Hunt" is a competitive AI-driven game where two snakes learn to navigate, collect food, and avoid collisions using Deep Q-Le…☆10Nov 18, 2025Updated 5 months ago
- ☆18Apr 20, 2025Updated last year
- Public code for implementation and experiments with differentiable decision trees.☆32Oct 17, 2024Updated last year
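- This project was created to help students with backgrounds in artificial intelligence, natural language processing, and large language models find jobs; contributions to building and maintaining the project are welcome☆18Mar 30, 2025Updated last year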
- Optimising electricity expenditure in an HVAC system under a dynamic electricity pricing scheme and varying weather conditions using a DDPG model.☆27Feb 6, 2022Updated 4 years ago
- A floating offshore wind farm simulation and flow control framework using FLORIS, MoorPy, and deep reinforcement learning☆21Jan 28, 2026Updated 2 months ago
- Face Recognition Door Lock☆16Nov 22, 2022Updated 3 years ago
- ☆12Nov 12, 2022Updated 3 years ago
- Implementation of Hippoformer, Integrating Hippocampus-inspired Spatial Memory with Transformers☆50Feb 5, 2026Updated 2 months ago
- 🕵 Code for our EMNLP 2025 Main paper: "FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games"☆25Dec 14, 2025Updated 4 months ago
- ICASSP 2025☆16Jun 23, 2025Updated 9 months ago
- Open-source code for paper CDT: Cascading Decision Trees for Explainable Reinforcement Learning☆39Oct 31, 2025Updated 5 months ago
- An open-source robot reinforcement learning platform using stable-baselines and OpenAI Gym☆10Mar 24, 2023Updated 3 years ago
- LLM Prompting for Text2SQL via Gradual SQL Refinement☆15Feb 19, 2025Updated last year
- Vision-driven Autonomous Flight of UAV Along River Using Deep Reinforcement Learning with Dynamic Expert Guidance☆15Mar 8, 2025Updated last year
- Solutions to neuralnetworksanddeeplearning.com☆14Dec 21, 2016Updated 9 years ago
- Archer2.0 evolves from its predecessor by introducing ASPO, which overcomes fundamental PPO-Clip limitations to prevent premature converg…☆31Oct 10, 2025Updated 6 months ago
- Autonomous UAV navigation using Deep Reinforcement Learning (DQN). The UAV learns to efficiently navigate grid-based environments, avoid …☆15Feb 11, 2025Updated last year
- Llama-style transformer in PyTorch with multi-node / multi-GPU training. Includes pretraining, fine-tuning, DPO, LoRA, and knowledge dist…☆23Updated this week
- ☆15Dec 29, 2020Updated 5 years ago
- ☆32Jun 12, 2025Updated 10 months ago
- Code for Sibling Rivalry and experiments presented in associated paper☆18May 1, 2025Updated 11 months ago
- DepthNav is a research framework for developing and evaluating autonomous navigation policies, particularly for aerial robots in complex …☆32Nov 17, 2025Updated 5 months ago
- Speech corpora for the speech recognition evaluation system☆19Mar 20, 2018Updated 8 years ago
- A command line tool for comparing JSON files by degree of similarity.☆12Oct 28, 2019Updated 6 years ago
- Systems Modeling. Learn a variety of systems, such as those involving mechanical, electrical, hydraulic, pneumatic systems, and mixtures …☆16Dec 20, 2017Updated 8 years ago
- One-Shot Unsupervised Cross Domain Detection☆13Nov 22, 2022Updated 3 years ago
- WebResearcher: An Iterative Deep-Research Agent☆49Feb 13, 2026Updated 2 months ago
- Providing the answer to "How to do patching on all available SAEs on GPT-2?". It is an official repository of the implementation of the p…☆13Jan 26, 2025Updated last year
- ☆14Oct 28, 2023Updated 2 years ago
- Inverse Reinforcement Learning via State Marginal Matching, CoRL 2020☆45Jul 19, 2023Updated 2 years ago
- Baidu speech examples☆50Feb 28, 2018Updated 8 years ago
- A small car rental project built with the SSM framework and layui☆12Dec 16, 2022Updated 3 years ago
- A multi-agent reinforcement learning solution to the Flatland3 challenge.☆18Feb 16, 2024Updated 2 years ago
- Value & Policy Iteration for the frozenlake environment of OpenAI☆15May 14, 2019Updated 6 years ago