edlanglois / mamdp

Code for the paper "How RL Agents Behave When Their Actions Are Modified"

☆9

Alternatives and similar repositories for mamdp

Users who are interested in mamdp are comparing it to the repositories listed below.

oxwhirl / opiq
Code for Optimistic Exploration even with a Pessimistic Initialisation
☆14 Updated 5 years ago
toshikwa / rljax
A collection of RL algorithms written in JAX.
☆102 Updated 3 years ago
rlai-lab / Regularized-GradientTD
Code repo for Gradient Temporal-Difference Learning with Regularized Corrections paper.
☆36 Updated 4 years ago
jannerm / gamma-models
Code for the paper "Gamma-Models: Generative Temporal Difference Learning for Infinite-Horizon Prediction"
☆44 Updated last year
behaviorguidedRL / BGRL
Open source demo for the paper Learning to Score Behaviors for Guided Policy Optimization
☆24 Updated 5 years ago
joeybose / FloRL
Implicit Normalizing Flows + Reinforcement Learning
☆61 Updated 6 years ago
facebookresearch / level-replay
This code implements Prioritized Level Replay, a method for sampling training levels for reinforcement learning agents that exploits the …
☆88 Updated 4 years ago
DavidJanz / successor_uncertainties_atari
Code for paper "Successor Uncertainties: Exploration and Uncertainty in Temporal Difference Learning" by David Janz*, Jiri Hron*, Przemys…
☆21 Updated 2 years ago
bstadie / krazyworld
krazy grid world
☆25 Updated 5 years ago
philipjball / OffCon3
📴 OffCon^3: SOTA PyTorch SAC and TD3 Implementations (arxiv: 2101.11331)
☆24 Updated 4 years ago
jonasrothfuss / model_ensemble_meta_learning
Implementation of the Model-Based Meta-Policy-Optimization (MB-MPO) algorithm
☆44 Updated 6 years ago
HumanCompatibleAI / evaluating-rewards
Library to compare and evaluate reward functions
☆67 Updated last year
henry-prior / jax-rl
JAX implementations of core Deep RL algorithms
☆81 Updated 3 years ago
ElisevanderPol / symmetrizer
☆31 Updated 4 years ago
facebookresearch / adversarially-motivated-intrinsic-goals
This repository contains code for the method and experiments of the paper "Learning with AMIGo: Adversarially Motivated Intrinsic Goals".
☆63 Updated last year
mklissa / PPOC
Proximal Policy Option-Critic
☆25 Updated 6 years ago
brain-research / mirage-rl
Code to reproduce the experiments in The Mirage of Action-Dependent Baselines in Reinforcement Learning.
☆17 Updated 7 years ago
abbyvansoest / maxent
☆13 Updated 6 years ago
nnaisense / MAGE
Learning Action-Value Gradients in Model-based Policy Optimization
☆31 Updated 3 years ago
deep-skill-chaining / deep-skill-chaining
Implementation of the skill discovery algorithm described in ICLR submission "Option Discovery using Deep Skill Chaining"
☆29 Updated 5 years ago
robintyh1 / icml2021-pengqlambda
Revisiting Peng's Q(lambda) for Modern Reinforcement Learning
☆16 Updated 4 years ago
seungyulhan / disc
☆9 Updated 2 years ago
uber-research / D3G
Estimating Q(s,s') with Deep Deterministic Dynamics Gradients
☆32 Updated 5 years ago
philipjball / SAC_PyTorch
🧶 Minimal PyTorch Soft Actor Critic (SAC) implementation
☆39 Updated 3 years ago
jinnaiyuu / Optimal-Options-ICML-2019
Code for generating options for planning and reinforcement learning
☆12 Updated 4 years ago
kristychoi / pixel_exploration
PyTorch implementation of Count-Based Exploration with Neural Density Models
☆10 Updated 7 years ago
yfletberliac / adversarially-guided-actor-critic
AGAC: Adversarially Guided Actor-Critic
☆48 Updated 3 years ago
younggyoseo / RE3
RE3: State Entropy Maximization with Random Encoders for Efficient Exploration
☆69 Updated 4 years ago
tesslerc / TD3-JAX
A JAX Implementation of the Twin Delayed DDPG Algorithm
☆35 Updated 5 years ago
facebookresearch / svg
On the model-based stochastic value gradient for continuous reinforcement learning
☆56 Updated 2 years ago