mhyrzt / Simple-MADRL-Chess

A multi-agent deep reinforcement learning (MADRL) project that solves a chess environment using PPO with two different methods: two agents/networks and a single agent/network.

☆19

Alternatives and similar repositories for Simple-MADRL-Chess

Users who are interested in Simple-MADRL-Chess are comparing it to the repositories listed below.


moduIo / Deep-Q-network
Keras implementation of DQN for the MsPacman-v0 OpenAI Gym environment.
☆37 Updated 2 years ago
facebookresearch / how-to-autorl
Plug-and-play hydra sweepers for the EA-based multifidelity method DEHB and several population-based training variations, all proven to e…
☆81 Updated last year
araffin / rlss23-dqn-tutorial
Deep Q-Network (DQN) and Fitted Q-Iteration (FQI) tutorial for RL Summer School 2023
☆73 Updated 8 months ago
alirezakazemipour / Continuous-PPO
Proximal Policy Optimization (Continuous Version) in PyTorch.
☆29 Updated 2 months ago
gkswamy98 / fast_irl
Contains an implementation of the FILTER algorithm for exponentially faster inverse reinforcement learning.
☆50 Updated 2 years ago
CursedSeraphim / icmppo
Proximal Policy Optimization (PPO) with Intrinsic Curiosity Module (ICM)
☆16 Updated 3 years ago
xingchenwan / bgpbt
[AutoML'22] Bayesian Generational Population-based Training (BG-PBT)
☆28 Updated 2 years ago
znowu / mirror-learning
The code for experiments conducted to verify the correctness of mirror learning.
☆11 Updated 3 years ago
hmishfaq / LSAC
The official code release for "Langevin Soft Actor-Critic: Efficient Exploration through Uncertainty-Driven Critic Learning", ICLR 2025
☆10 Updated last month
smorad / graph-conv-memory-paper
Graph convolutional memory for reinforcement learning
☆23 Updated 4 years ago
facebookresearch / ssorl
Semi-Supervised Offline Reinforcement Learning with Action-Free Trajectories
☆42 Updated 2 years ago
zhougroup / IDAC
Implicit Distributional Actor Critic
☆11 Updated 3 years ago
automl / HPO_for_RL
This is the code for reproducing the results of our paper: On the importance of Hyperparameter Optimization for Model-based Reinforcement …
☆15 Updated 3 years ago
vwxyzjn / a2c_is_a_special_case_of_ppo
A2C is a special case of PPO!
☆22 Updated 3 years ago
Mateus224 / Visual-Explanation-in-Deep-Reinforcement-Learning
This project visualizes the knowledge of an agent trained by Deep Reinforcement Learning (paper will be published) using Backpropagation,…
☆18 Updated 5 years ago
Matt00n / PolicyGradientsJax
On-Policy Policy Gradient Algorithms in JAX
☆38 Updated last year
OpenRL-Lab / TiZero
Code accompanying the paper "TiZero: Mastering Multi-Agent Football with Curriculum Learning and Self-Play" (AAMAS 2023). Football game AI agent.
☆59 Updated last year
PatrickKorus / mcts-general
General Python implementation of Monte Carlo Tree Search for use with OpenAI Gym environments.
☆40 Updated 4 years ago
yiqiwang8177 / Official-codebase-for-Decision-Transducer
This is the PyTorch implementation of the UAI 2023 paper "A Trajectory is Worth Three Sentences: Multimodal Transformer for Offline Reinf…
☆11 Updated last year
twitter-research / hyperbolic-rl
☆55 Updated 2 years ago
microsoft / strategically_efficient_rl
More efficient exploration for reinforcement learning in two-player, zero-sum games
☆21 Updated 11 months ago
joeybose / FloRL
Implicit Normalizing Flows + Reinforcement Learning
☆61 Updated 6 years ago
jinPrelude / simple-es
Simple implementations of multi-agent evolutionary strategies using PyTorch.
☆16 Updated 3 years ago
billtubbs / gym-CartPole-bt-v0
A modified version of the cart-pole OpenAI Gym environment for testing different control policies
☆13 Updated last year
vmartinezalvarez / DyNODE
DyNODE: Neural Ordinary Differential Equations for Dynamics Modeling in Continuous Control
☆23 Updated 4 years ago
google-research / reincarnating_rl
[NeurIPS 2022] Open source code for reusing prior computational work in RL.
☆96 Updated 2 years ago
mgroling / GymRubiksCube
OpenAI Gym environment for the Rubik's Cube (3x3x3).
☆11 Updated 2 years ago
robocin / rSoccer
🎳 Environments for Reinforcement Learning
☆59 Updated last month
jsztompka / MultiAgent-PPO
Proximal Policy Optimization with a Beta distribution, using the multi-agent Unity ML Tennis environment
☆29 Updated 6 years ago
RaghuHemadri / Multi-Agent-Reinforcement-Learning-Survey-Papers
☆33 Updated 4 years ago