NTT123 / a0-jax

AlphaZero in JAX

☆78

Alternatives and similar repositories for a0-jax

Users who are interested in a0-jax are comparing it to the libraries listed below.

kenjyoung / mctx_learning_demo
☆52 Updated 2 years ago
lowrollr / turbozero
fast + parallel AlphaZero in JAX
☆97 Updated 7 months ago
rlglab / minizero
MiniZero: An AlphaZero and MuZero Training Framework
☆96 Updated last week
sotetsuk / pgx
♟️ Vectorized RL game environments in JAX
☆510 Updated 5 months ago
vwxyzjn / cleanba
CleanRL's implementation of DeepMind's Podracer Sebulba Architecture for Distributed DRL
☆114 Updated 11 months ago
Carbon225 / mctx-classic
Classic MCTS example with mctx
☆21 Updated 2 years ago
kaesve / muzero
A clean implementation of MuZero and AlphaZero following the AlphaZero General framework. Train and Pit both algorithms against each othe…
☆160 Updated 4 years ago
Hwhitetooth / jax_muzero
An implementation of MuZero in JAX.
☆56 Updated 2 years ago
bwfbowen / muax
A project that provides help for using DeepMind's mctx on gym-style environments.
☆60 Updated 8 months ago
hr0nix / dejax
Accelerated replay buffers in JAX
☆43 Updated 2 years ago
coax-dev / coax
Modular framework for Reinforcement Learning in python
☆174 Updated 2 years ago
danijar / ninjax
General Modules for JAX
☆66 Updated 4 months ago
Reytuag / transformerXL_PPO_JAX
☆81 Updated 9 months ago
facebookresearch / minimax
Efficient baselines for autocurricula in JAX.
☆191 Updated 11 months ago
epignatelli / navix
Accelerated minigrid environments with JAX
☆139 Updated last week
rystrauss / dopamax
Reinforcement learning in pure JAX.
☆13 Updated 5 months ago
instadeepai / flashbax
⚡ Flashbax: Accelerated Replay Buffers in JAX
☆242 Updated this week
DramaCow / jaxued
☆82 Updated 4 months ago
MichaelTMatthews / Craftax
(Crafter + NetHack) in JAX. ICML 2024 Spotlight.
☆330 Updated last month
henry-prior / jax-rl
JAX implementations of core Deep RL algorithms
☆81 Updated 3 years ago
andyljones / boardlaw
Scaling scaling laws with board games.
☆51 Updated 2 years ago
Egiob / cfrx
cfrx is a collection of algorithms and tools for hardware-accelerated Counterfactual Regret Minimization (CFR) algorithms in Jax.
☆34 Updated 11 months ago
hr0nix / omega
A number of agents (PPO, MuZero) with a Perceiver-based NN architecture that can be trained to achieve goals in nethack/minihack environm…
☆41 Updated 2 years ago
RobertTLange / gymnax-blines
Baselines for gymnax 🤖
☆68 Updated 2 years ago
bmazoure / ppo_jax
Jax implementation of Proximal Policy Optimization (PPO) specifically tuned for Procgen, with benchmarked results and saved model weights…
☆57 Updated 3 years ago
Zeta36 / muzero
A simple implementation of MuZero algorithm for connect4 game
☆96 Updated 4 years ago
instadeepai / sebulba
🪐 The Sebulba architecture to scale reinforcement learning on Cloud TPUs in JAX
☆58 Updated last year
tuero / muzero-cpp
A C++ pytorch implementation of MuZero
☆39 Updated last year
lowrollr / mctx-az
Monte Carlo tree search in JAX, with functionality to continue search from a previous subtree
☆20 Updated 3 months ago
mttga / purejaxql
Simple single-file baselines for Q-Learning in pure-GPU setting
☆174 Updated 4 months ago