DHDev0 / Muzero

PyTorch implementation of MuZero for gym environments. It supports any Discrete, Box, and Box2D configuration for the action space and observation space.

☆17

Alternatives and similar repositories for Muzero

Users who are interested in Muzero are comparing it to the repositories listed below.

instadeepai / fastpbrl
Vectorization techniques for fast population-based training.
☆56 Updated 2 years ago
DHDev0 / Muzero-unplugged
PyTorch implementation of MuZero Unplugged for gym environment. This algorithm is capable of supporting a wide range of action and observ…
☆27 Updated 2 years ago
RobertTLange / gymnax-blines
Baselines for gymnax 🤖
☆67 Updated 2 years ago
jcoreyes / evolvingrl
Supplementary Data for Evolving Reinforcement Learning Algorithms
☆46 Updated 4 years ago
DHDev0 / Stochastic-muzero
PyTorch implementation of Stochastic MuZero for gym environment. This algorithm is capable of supporting a wide range of action and obser…
☆66 Updated last year
vwxyzjn / cleanba
CleanRL's implementation of DeepMind's Podracer Sebulba Architecture for Distributed DRL
☆113 Updated 10 months ago
MarcoMeter / neroRL
Deep Reinforcement Learning Framework done with PyTorch
☆36 Updated 3 months ago
neuroevobench / neuroevobench
Neuroevolution Benchmark in JAX 🦕
☆39 Updated last year
facebookresearch / mtenv
MultiTask Environments for Reinforcement Learning.
☆76 Updated 2 years ago
xingchenwan / bgpbt
[AutoML'22] Bayesian Generational Population-based Training (BG-PBT)
☆28 Updated 2 years ago
tuomaso / radial_rl
Code used in our paper "Robust Deep Reinforcement Learning through Adversarial Loss"
☆33 Updated last year
epignatelli / discovering-reinforcement-learning-algorithms
A Jax/Stax implementation of the general meta learning paper: Oh, J., Hessel, M., Czarnecki, W.M., Xu, Z., van Hasselt, H.P., Singh, S. a…
☆21 Updated 4 years ago
Miffyli / nle-sample-factory-baseline
☆22 Updated 3 months ago
JBLanier / stratego_env
Multi-Agent RL Environment for the Stratego Board Game (and variants)
☆34 Updated last year
tianjunz / MADE
☆19 Updated 3 years ago
microsoft / logrl
Logarithmic Reinforcement Learning
☆26 Updated 2 years ago
keraJLi / synthetic-gymnax
Drop-in environment replacements that make your RL algorithm train faster.
☆21 Updated last year
facebookresearch / level-replay
This code implements Prioritized Level Replay, a method for sampling training levels for reinforcement learning agents that exploits the …
☆86 Updated 4 years ago
andyljones / boardlaw
Scaling scaling laws with board games.
☆49 Updated last year
timoklein / alphazero-gym
AlphaZero for continuous control tasks
☆23 Updated 2 years ago
Hwhitetooth / jax_muzero
An implementation of MuZero in JAX.
☆56 Updated 2 years ago
Egiob / cfrx
cfrx is a collection of algorithms and tools for hardware-accelerated Counterfactual Regret Minimization (CFR) algorithms in Jax.
☆32 Updated 10 months ago
kenjyoung / mctx_learning_demo
☆51 Updated 2 years ago
rlai-lab / Regularized-GradientTD
Code repo for Gradient Temporal-Difference Learning with Regularized Corrections paper.
☆37 Updated 4 years ago
HumanCompatibleAI / evaluating-rewards
Library to compare and evaluate reward functions
☆67 Updated last year
alexis-jacq / LOLA_DiCE
PyTorch implementation of LOLA (https://arxiv.org/abs/1709.04326) using DiCE (https://arxiv.org/abs/1802.05098)
☆95 Updated 6 years ago
bmazoure / ppo_jax
Jax implementation of Proximal Policy Optimization (PPO) specifically tuned for Procgen, with benchmarked results and saved model weights…
☆56 Updated 2 years ago
Farama-Foundation / CrowdPlay
A web based platform for collecting human actions in reinforcement learning environments
☆30 Updated last year
schmidtdominik / Rainbow
Rainbow DQN implementation accompanying the paper "Fast and Data-Efficient Training of Rainbow" which reaches 205.7 median HNS after 10M …
☆45 Updated 3 years ago
linesd / tabular-methods
Tabular methods for reinforcement learning
☆38 Updated 4 years ago