gingkg / AlphaZero_Gomoku_PyTorch
A Gomoku AI built on PyTorch and trained with reinforcement learning (self-play + MCTS)
☆26Updated 4 years ago
Alternatives and similar repositories for AlphaZero_Gomoku_PyTorch
Users who are interested in AlphaZero_Gomoku_PyTorch are comparing it to the libraries listed below
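- AlphaGo-Zero-Gobang is a reinforcement-learning-based Gomoku (Gobang) model, intended mainly as a demo for understanding how AlphaGo Zero works: how the neural network guides MCTS decisions and how it learns through self-play. Source code + tutorial☆108Updated 4 months ago
- This project uses Monte Carlo tree search and a residual neural network to build a Gomoku AI that reaches strong playing strength after short training on modest hardware. Following the original AlphaGo Zero paper "Mastering the game of Go without human knowledge", it implements, for Gomoku…☆47Updated 3 years ago
- Tencent Kaiwu agent competition (Honor of Kings AI competition, stable version)☆54Updated last month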
- ☆390Updated last year
- Implementation of the PPO algorithm on MuJoCo environments such as Ant-v2, Humanoid-v2, Hopper-v2, and HalfCheetah-v2.☆53Updated 5 years ago
- Implementations of classic reinforcement learning algorithms (offline/online learning, Q-learning, DQN) on the cart-pole balancing game and several Atari games (CartPole, Pong, Boxing, MsPacman)☆32Updated 7 years ago
- An easier PyTorch deep reinforcement learning library.☆239Updated 9 months ago
- ☆49Updated 5 months ago
- ☆229Updated 7 months ago
- Exercise solutions and code examples for Reinforcement Learning, second edition☆161Updated 4 years ago
- ☆54Updated 8 months ago
- Build your own Chinese chess (Xiangqi) AI with the AlphaZero algorithm☆279Updated 3 years ago
- rl-papers☆48Updated 2 years ago
- A curated list of visual reinforcement learning resources☆404Updated last week
- Honor of Kings AI Open Environment of Tencent☆769Updated last year
- A non-embedded AI for Clash Royale based on RL and CV.☆320Updated last year
- ☆662Updated 2 years ago
- [ICLR 2025 Oral] PyTorch code for the paper "Open-World Reinforcement Learning over Long Short-Term Imagination"☆166Updated 3 months ago
- ☆90Updated 3 years ago
- Human-vs-computer Gomoku based on DQN☆59Updated 6 years ago
- basic algorithms of reinforcement learning☆213Updated 2 years ago
- ☆66Updated last year
- LLM-PySC2 is NKAI Decision Team and NUDT Decision Team's Python component of the StarCraft II LLM Decision Environment. It exposes Deepmi…☆138Updated 5 months ago
- PPO, DDPG, and SAC implementations on MuJoCo environments☆118Updated 3 years ago
- Chinese edition of the OpenAI team's deep reinforcement learning tutorial☆31Updated 5 years ago
- NeurIPS 2024 DACER☆142Updated last month
- The Code for Paper “Relay Hindsight Experience Replay: Self-Guided Continual Reinforcement Learning for Sequential Object Manipulation Ta…☆156Updated last year
- A collection of notes @SJTU-CSE, written by Yanjie Ze. Undergraduate review notes from the Computer Science department at Shanghai Jiao Tong University. Browse online: https://zeyanjie.gitbook.io/yanjie-zes-note/ ☆21Updated 3 years ago
- Pessimistic Value Iteration for Multi-Task Data Sharing in Offline RL☆17Updated last year
- Playing Sekiro with D3QN reinforcement learning☆29Updated 3 years ago