tigert1998 / rl-gobang
AlphaZero implementation on Gomoku
☆16 Updated last month
Alternatives and similar repositories for rl-gobang:
Users that are interested in rl-gobang are comparing it to the libraries listed below
- Train AI agents to master free-style Gomoku (五子棋) ☆13 Updated last year
- An implementation of the Raft consensus protocol. ☆14 Updated 6 years ago
- This is the source code of the Agar.io environment. ☆23 Updated 3 years ago
- A novel parallel UCT algorithm with linear speedup and negligible performance loss. ☆116 Updated 3 years ago
- RLA is a tool for managing your RL experiments automatically ☆71 Updated 2 years ago
- A modified AlphaZero implementation in C++, where performance matters. ☆17 Updated last year
- A code implementation for our arXiv paper "Multi-agent Adhoc Team Play using Decompositional Q function" ☆129 Updated last year
- An asynchronous/parallel method of the AlphaGo Zero algorithm with Gomoku ☆202 Updated last month
- Lecture notes on probability ☆17 Updated 4 years ago
- GPU cluster Kubernetes configurations and usage ☆34 Updated 3 years ago
- Source code for the paper "Divergence-Augmented Policy Optimization" ☆37 Updated 5 years ago
- CivRealm is an interactive environment for the open-source strategy game Freeciv-web based on Freeciv, a Civilization-inspired game. ☆108 Updated 6 months ago
- Implementation of the Decision-Pretrained Transformer (DPT) from the paper Supervised Pretraining Can Learn In-Context Reinforcement Learni… ☆60 Updated 10 months ago
- [NeurIPS 2022] "NSNet: A General Neural Probabilistic Framework for Satisfiability Problems" ☆18 Updated 2 years ago
- Agent to play the game Hex, based on the Expert Iteration from the paper Thinking Fast and Slow with Deep Learning and Tree Search (NIPS … ☆7 Updated 6 years ago
- Code for "Joint Policy Search for Collaborative Multi-agent Incomplete Information Games" ☆51 Updated last year
- This is the source code of RPG (Reward-Randomized Policy Gradient) ☆43 Updated 2 years ago
- Meeting notes from attending top conferences ☆15 Updated 5 years ago
- The Official Code for Offline Model-based Adaptable Policy Learning (NeurIPS'21 & TPAMI) ☆23 Updated last year
- Chinese Standard Mahjong Competition hosted by AILab in Peking University. ☆99 Updated 2 years ago
- Open source code for paper "Denoised MDPs: Learning World Models Better Than the World Itself" ☆136 Updated last year
- PyTorch implementation of Vanilla PG, TNPG, TRPO, PPO on the MuJoCo environment ☆14 Updated 6 years ago
- A collection of research and survey papers on hierarchical reinforcement learning (HRL). ☆44 Updated 4 years ago