deep-reinforcement-learning-book / Chapter15-AlphaZero
Chapter 15 (AlphaZero) of the book Deep Reinforcement Learning: a code example of AlphaZero solving the game of Gomoku.
☆31 Updated 4 years ago
Related projects
Alternatives and complementary repositories for Chapter15-AlphaZero
- A Multi-agent Learning Framework ☆62 Updated 3 years ago
- Source code for the paper "Divergence-Augmented Policy Optimization" ☆37 Updated 4 years ago
- ☆38 Updated 2 months ago
- ☆97 Updated 3 years ago
- ☆28 Updated last year
- Advantage actor-critic reinforcement learning for OpenAI Gym CartPole ☆64 Updated 7 years ago
- Unified Model-Free Hierarchical Reinforcement Learning Framework ☆37 Updated 5 years ago
- ☆25 Updated 3 years ago
- A code implementation for our arXiv paper "Multi-agent Adhoc Team Play using Decompositional Q function" ☆127 Updated last year
- A pack of reinforcement learning algorithms. ☆81 Updated 3 years ago
- A new paper list for multi-agent reinforcement learning (actively maintained) ☆25 Updated 4 years ago
- ☆41 Updated 2 years ago
- This is the source code of RPG (Reward-Randomized Policy Gradient) ☆43 Updated 2 years ago
- Assignments for CS294-112 Fall 2018 in PyTorch ☆63 Updated 6 years ago
- Personal repo to keep track of RL papers ☆31 Updated 3 years ago
- Multi-Agent Determinantal Q-Learning ☆42 Updated 2 years ago
- ☆18 Updated 5 years ago
- ☆33 Updated 6 years ago
- Efficient Reinforcement Learning with a Thought-Game for StarCraft ☆46 Updated last year
- A PyTorch implementation of Distributed Proximal Policy Optimization (DPPO). ☆61 Updated 6 years ago
- Random Network Distillation (RND) algorithm in PyTorch ☆48 Updated 5 years ago
- ☆158 Updated last year
- ☆80 Updated 5 months ago
- RLA is a tool for managing your RL experiments automatically ☆70 Updated last year
- PyTorch implementation of FeUdal Networks for Hierarchical Reinforcement Learning (FuNs), Vezhnevets et al. 2017. ☆38 Updated 4 years ago
- Implementation of PPO-clip and PPO-penalty on Atari; described as the only open-source implementation of PPO-penalty ☆56 Updated 5 years ago
- TD3, SAC, IQN, Rainbow, PPO, Ape-X, etc. in TF1.x ☆62 Updated 3 years ago
- ☆29 Updated 2 years ago
- ☆121 Updated 3 years ago
- Proximal Policy Optimization with Beta distribution - uses multi-agent Unity ML Tennis ☆28 Updated 5 years ago