epignatelli / discovering-reinforcement-learning-algorithms

A JAX/Stax implementation of the general meta-learning paper: Oh, J., Hessel, M., Czarnecki, W.M., Xu, Z., van Hasselt, H.P., Singh, S. and Silver, D., 2020. Discovering reinforcement learning algorithms. Advances in Neural Information Processing Systems, 33.

☆22

Alternatives and similar repositories for discovering-reinforcement-learning-algorithms

Users who are interested in discovering-reinforcement-learning-algorithms compare it to the repositories listed below.

Hwhitetooth / jax_muzero
An implementation of MuZero in JAX.
☆56 Updated 2 years ago
joeybose / FloRL
Implicit Normalizing Flows + Reinforcement Learning
☆61 Updated 6 years ago
evgenii-nikishin / omd
JAX code for the paper "Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation"
☆43 Updated 4 years ago
bstadie / krazyworld
krazy grid world
☆25 Updated 5 years ago
behaviorguidedRL / BGRL
Open source demo for the paper Learning to Score Behaviors for Guided Policy Optimization
☆24 Updated 5 years ago
zdhNarsil / Stochastic-Marginal-Actor-Critic
Official PyTorch implementation for our ICLR 2023 paper "Latent State Marginalization as a Low-cost Approach for Improving Exploration".
☆24 Updated 2 years ago
jannerm / gamma-models
Code for the paper "Gamma-Models: Generative Temporal Difference Learning for Infinite-Horizon Prediction"
☆44 Updated last year
joelouismarino / variational_rl
Variational Reinforcement Learning
☆16 Updated last year
instadeepai / fastpbrl
Vectorization techniques for fast population-based training.
☆56 Updated 2 years ago
sail-sg / optim4rl
Optim4RL is a JAX framework for learning to optimize in reinforcement learning.
☆26 Updated 8 months ago
Kaixhin / GUDRL
Generalised UDRL
☆37 Updated 3 years ago
danijar / ninjax
General Modules for JAX
☆66 Updated 4 months ago
uber-research / D3G
Estimating Q(s,s') with Deep Deterministic Dynamics Gradients
☆32 Updated 5 years ago
ben-eysenbach / mnm
Code to accompany the paper "Mismatched No More: Joint Model-Policy Optimization for Model-Based RL"
☆20 Updated 3 years ago
Miffyli / nle-sample-factory-baseline
☆22 Updated 4 months ago
YyzHarry / SV-RL
[ICLR 2020, Oral] Harnessing Structures for Value-Based Planning and Reinforcement Learning
☆34 Updated 5 years ago
facebookresearch / level-replay
This code implements Prioritized Level Replay, a method for sampling training levels for reinforcement learning agents that exploits the …
☆88 Updated 4 years ago
danijar / crafter-baselines
Docker containers of baseline agents for the Crafter environment
☆28 Updated 3 years ago
hr0nix / omega
A number of agents (PPO, MuZero) with a Perceiver-based NN architecture that can be trained to achieve goals in nethack/minihack environm…
☆41 Updated 2 years ago
brentyi / minGPT-flax
GPT implementation in Flax
☆18 Updated 3 years ago
bmazoure / ppo_jax
JAX implementation of Proximal Policy Optimization (PPO) specifically tuned for Procgen, with benchmarked results and saved model weights…
☆57 Updated 3 years ago
xingchenwan / bgpbt
[AutoML'22] Bayesian Generational Population-based Training (BG-PBT)
☆28 Updated 2 years ago
montrealrobotics / iv_rl
IV-RL - Sample Efficient Deep Reinforcement Learning via Uncertainty Estimation
☆40 Updated 3 weeks ago
hr0nix / dejax
Accelerated replay buffers in JAX
☆43 Updated 2 years ago
jonasrothfuss / model_ensemble_meta_learning
Implementation of the Model-Based Meta-Policy-Optimization (MB-MPO) algorithm
☆44 Updated 6 years ago
subho406 / Recurrent-PPO-Jax
Implementation of Proximal Policy Optimization in JAX+Flax
☆20 Updated 2 years ago
johanobandoc / revisiting_rainbow
Revisiting Rainbow
☆75 Updated 4 years ago
ElisevanderPol / symmetrizer
☆31 Updated 4 years ago
rlai-lab / Regularized-GradientTD
Code repo for the paper "Gradient Temporal-Difference Learning with Regularized Corrections".
☆36 Updated 4 years ago
jcoreyes / evolvingrl
Supplementary Data for Evolving Reinforcement Learning Algorithms
☆46 Updated 4 years ago