A reinforcement learning agent that learns to solve mazes using Group Relative Policy Optimization (GRPO).
☆12Feb 9, 2025Updated last year
Alternatives and similar repositories for grpo-maze-solver
Users that are interested in grpo-maze-solver are comparing it to the libraries listed below.
- 🐭 A tiny single-file implementation of Group Relative Policy Optimization (GRPO) as introduced by the DeepSeekMath paper☆39Jun 28, 2025Updated 8 months ago
- Implementation of PPO for CartPole-v1☆10Jan 1, 2019Updated 7 years ago
- Hdl21 Schematics☆16Jan 24, 2024Updated 2 years ago
- A `Markdown` note-taking app backed by `Git` repository storage☆22Nov 28, 2019Updated 6 years ago
- "Snake's Food Hunt" is a competitive AI-driven game where two snakes learn to navigate, collect food, and avoid collisions using Deep Q-Le…☆10Nov 18, 2025Updated 4 months ago
- ☆18Apr 20, 2025Updated 11 months ago
- Public code for implementation and experiments with differentiable decision trees.☆32Oct 17, 2024Updated last year
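- This project was created to help people with backgrounds in artificial intelligence, natural language processing, and large language models find jobs; contributions to building and maintaining the project are welcome☆17Mar 30, 2025Updated 11 months ago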
- Optimising electricity expenditure in an HVAC system under dynamic electricity pricing scheme and weather conditions using a DDPG model.☆27Feb 6, 2022Updated 4 years ago
- A floating offshore wind farm simulation and flow control framework using FLORIS, MoorPy, and deep reinforcement learning☆20Jan 28, 2026Updated last month
- Face Recognition Door Lock☆16Nov 22, 2022Updated 3 years ago
- ☆12Nov 12, 2022Updated 3 years ago
- 🕵 Code for our EMNLP 2025 Main paper: "FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games"☆25Dec 14, 2025Updated 3 months ago
- ICASSP 2025☆16Jun 23, 2025Updated 9 months ago
- Open-source code for paper CDT: Cascading Decision Trees for Explainable Reinforcement Learning☆38Oct 31, 2025Updated 4 months ago
- An open source robot reinforcement learning platform using stable-baselines and OpenAI Gym☆10Mar 24, 2023Updated 3 years ago
- LLM Prompting for Text2SQL via Gradual SQL Refinement☆15Feb 19, 2025Updated last year
- Solutions to neuralnetworksanddeeplearning.com☆14Dec 21, 2016Updated 9 years ago
- Vision-driven Autonomous Flight of UAV Along River Using Deep Reinforcement Learning with Dynamic Expert Guidance☆15Mar 8, 2025Updated last year
- Autonomous UAV navigation using Deep Reinforcement Learning (DQN). The UAV learns to efficiently navigate grid-based environments, avoid …☆14Feb 11, 2025Updated last year
- Llama-style transformer in PyTorch with multi-node / multi-GPU training. Includes pretraining, fine-tuning, DPO, LoRA, and knowledge dist…☆22Mar 10, 2026Updated 2 weeks ago
- ☆15Dec 29, 2020Updated 5 years ago
- ☆30Jun 12, 2025Updated 9 months ago
- Code for Sibling Rivalry and experiments presented in associated paper☆17May 1, 2025Updated 10 months ago
- Speech corpora for the speech recognition evaluation system☆19Mar 20, 2018Updated 8 years ago
- DepthNav is a research framework for developing and evaluating autonomous navigation policies, particularly for aerial robots in complex …☆29Nov 17, 2025Updated 4 months ago
- A command line tool for comparing JSON files by degree of similarity.☆12Oct 28, 2019Updated 6 years ago
- Systems Modeling. Learn a variety of systems, such as those involving mechanical, electrical, hydraulic, pneumatic systems, and mixtures …☆16Dec 20, 2017Updated 8 years ago
- One-Shot Unsupervised Cross Domain Detection☆13Nov 22, 2022Updated 3 years ago
- Providing the answer to "How to do patching on all available SAEs on GPT-2?". It is an official repository of the implementation of the p…☆13Jan 26, 2025Updated last year
- ☆14Oct 28, 2023Updated 2 years ago
- Inverse Reinforcement Learning via State Marginal Matching, CoRL 2020☆45Jul 19, 2023Updated 2 years ago
- Baidu speech examples☆50Feb 28, 2018Updated 8 years ago
- A small car rental project built with the SSM framework and layui☆12Dec 16, 2022Updated 3 years ago
- A multi-agent reinforcement learning solution to Flatland3 challenge.☆18Feb 16, 2024Updated 2 years ago
- Value & Policy Iteration for the frozenlake environment of OpenAI☆15May 14, 2019Updated 6 years ago
- ☆11Jan 9, 2025Updated last year
- Models from paper Kišš, Martin, Michal Hradiš, and Oldřich Kodym. “Brno Mobile OCR Dataset.” International Conference on Document Analysi…☆29Jul 23, 2019Updated 6 years ago
- An Android game app built with Android Studio and Java: a puzzle-solving game that combines RPG and GalGame modes, with inventory, map, trading, and save systems.☆21Mar 11, 2024Updated 2 years ago