albertometelli / wql

☆9

Alternatives and similar repositories for wql

Users who are interested in wql are comparing it to the libraries listed below.

sebascuri / hucrl
☆30 Updated last year
behaviorguidedRL / BGRL
Open source demo for the paper Learning to Score Behaviors for Guided Policy Optimization
☆24 Updated 5 years ago
mcmachado / count_based_exploration_sr
☆31 Updated 6 years ago
zafarali / emdp
Easy MDPs and grid worlds with accessible transition dynamics to do exact calculations
☆49 Updated 3 years ago
nnaisense / MAGE
Learning Action-Value Gradients in Model-based Policy Optimization
☆31 Updated 3 years ago
RomainLaroche / SPIBB
Safe Policy Improvement with Baseline Bootstrapping
☆26 Updated 5 years ago
RLAgent / state-marginal-matching
Efficient Exploration via State Marginal Matching (2019)
☆69 Updated 6 years ago
jonasrothfuss / model_ensemble_meta_learning
Implementation of the Model-Based Meta-Policy-Optimization (MB-MPO) algorithm
☆44 Updated 6 years ago
thanard / me-trpo
☆92 Updated last year
WilsonWangTHU / POPLIN
☆99 Updated 2 years ago
robintyh1 / onpolicybaselines
on-policy optimization baselines for deep reinforcement learning
☆30 Updated 5 years ago
justinjfu / diagnosing_qlearning
Code for Diagnosing Bottlenecks in Deep Q-learning. Contains implementations of tabular environments plus solvers.
☆19 Updated 6 years ago
RockySJ / ampo
☆15 Updated 4 years ago
ElisevanderPol / mdp-homomorphic-networks
☆29 Updated 4 years ago
yifan12wu / rl-laplacian
Learning Laplacian Representations in Reinforcement Learning
☆16 Updated 4 years ago
stratisMarkou / sample-efficient-bayesian-rl
Source for the sample efficient tabular RL submission to the 2019 NIPS workshop on Biological and Artificial RL
☆25 Updated 3 years ago
mklissa / PPOC
Proximal Policy Option-Critic
☆25 Updated 6 years ago
facebookresearch / level-replay
This code implements Prioritized Level Replay, a method for sampling training levels for reinforcement learning agents that exploits the …
☆87 Updated 4 years ago
arushijain94 / SafeOptionCritic
Safe Option-Critic: Learning Safety in the Option-Critic Architecture
☆20 Updated 6 years ago
kavosh8 / Lip
☆13 Updated 7 years ago
DavidJanz / successor_uncertainties_atari
Code for paper "Successor Uncertainties: Exploration and Uncertainty in Temporal Difference Learning" by David Janz*, Jiri Hron*, Przemys…
☆21 Updated 2 years ago
nnaisense / MAX
Code for reproducing experiments in Model-Based Active Exploration, ICML 2019
☆79 Updated 5 years ago
ermongroup / CalibratedModelBasedRL
Code for "Calibrated Model-Based Deep Reinforcement Learning", ICML 2019.
☆56 Updated 6 years ago
uber-research / D3G
Estimating Q(s,s') with Deep Deterministic Dynamics Gradients
☆32 Updated 5 years ago
pairlab / vagram
[ICLR 22] Value Gradient weighted Model-Based Reinforcement Learning.
☆24 Updated 2 years ago
alversafa / option-critic-arch
Implementation of the Option-Critic Architecture
☆39 Updated 6 years ago
mcmachado / options
☆43 Updated 8 years ago
tgangwani / BMIL
Pytorch code for "Learning Belief Representations for Imitation Learning in POMDPs" (UAI 2019)
☆20 Updated 2 years ago
dennisl88 / rand_param_envs
Random parameter environments using gym 0.7.4 and mujoco-py 0.5.7
☆20 Updated 6 years ago
Ji4chenLi / Multi-Task-Batch-RL
☆26 Updated 2 years ago