instadeepai / sebulba
The Sebulba architecture to scale reinforcement learning on Cloud TPUs in JAX
☆46 · Updated 10 months ago
Related projects:
- A collection of matrix games in JAX ☆9 · Updated 8 months ago
- Accelerated replay buffers in JAX ☆39 · Updated 2 years ago
- An Open-Ended Agentic Simulator ☆17 · Updated last month
- ☆56 · Updated 3 weeks ago
- Vectorization techniques for fast population-based training. ☆52 · Updated 2 years ago
- ☆59 · Updated last month
- ☆25 · Updated this week
- General Modules for JAX ☆57 · Updated last month
- JAX implementations of various deep reinforcement learning algorithms. ☆18 · Updated 3 years ago
- Single-file SAC-N implementation in JAX with Flax and Equinox. 10x faster than PyTorch ☆46 · Updated last year
- Jax-Baseline is a Reinforcement Learning implementation using JAX and Flax/Haiku libraries, mirroring the functionality of Stable-Baselin… ☆33 · Updated last week
- CleanRL's implementation of DeepMind's Podracer Sebulba Architecture for Distributed DRL ☆102 · Updated 3 weeks ago
- A collection of RL algorithms written in JAX. ☆92 · Updated 2 years ago
- Conservative Q-learning in JAX ☆49 · Updated last year
- JAX implementations of core Deep RL algorithms ☆79 · Updated 2 years ago
- JAX reimplementation of the DeepMind paper "Genie: Generative Interactive Environments" ☆27 · Updated last week
- ☆14 · Updated last month
- ☆141 · Updated 2 weeks ago
- ⚡ Flashbax: Accelerated Replay Buffers in JAX ☆195 · Updated 3 weeks ago
- Learning diverse options through the Laplacian representation. ☆22 · Updated 8 months ago
- Simple single-file baselines for Q-Learning in a pure-GPU setting ☆87 · Updated last month
- Accompanying Code for "Flipping Coins to Estimate Pseudocounts for Exploration in Reinforcement Learning", ICML 2023 ☆16 · Updated 8 months ago
- A tool for aggregating and plotting MARL experiment data. ☆57 · Updated 3 weeks ago
- cfrx is a collection of algorithms and tools for hardware-accelerated Counterfactual Regret Minimization (CFR) in JAX. ☆27 · Updated last month
- ☆41 · Updated last year
- Baselines for gymnax ☆57 · Updated last year
- Benchmarking RL generalization in an interpretable way. ☆128 · Updated 7 months ago
- JAX implementation of Proximal Policy Optimization (PPO) specifically tuned for Procgen, with benchmarked results and saved model weights… ☆49 · Updated 2 years ago
- An implementation of MuZero in JAX. ☆52 · Updated last year
- Skeleton for scalable and flexible JAX RL implementations ☆58 · Updated last year