yhyu13 / C51-DDPG

This is a TensorFlow implementation of DeepMind's "A Distributional Perspective on Reinforcement Learning" (C51-DDPG).

☆10

Alternatives and similar repositories for C51-DDPG

Users who are interested in C51-DDPG are comparing it to the repositories listed below.

abhishm / PGQ
PGQ is an approach to combine Policy Gradient and Q-Learning. This repository will contain an implementation of PGQ.
☆15 Updated 8 years ago
DartML / PPO-Stein-Control-Variate
Proximal Policy Optimization with Stein Control Variates
☆33 Updated 7 years ago
russellmendonca / maesn_suite
☆43 Updated 6 years ago
jvmncs / ParamNoise
A comparison of parameter space noise methods for exploration in deep reinforcement learning
☆27 Updated 6 years ago
seungjaeryanlee / rl-exploration
Reinforcement Learning papers on exploration methods.
☆19 Updated 3 years ago
Alfo5123 / Robust-Multitask-RL
Machine Learning Course Project Skoltech 2018
☆108 Updated 6 years ago
Feryal / craft-env
☆44 Updated 6 years ago
DavidJanz / successor_uncertainties_atari
Code for paper "Successor Uncertainties: Exploration and Uncertainty in Temporal Difference Learning" by David Janz*, Jiri Hron*, Przemys…
☆21 Updated 2 years ago
TianhongDai / self-imitation-learning-pytorch
PyTorch implementation of the ICML 2018 paper "Self-Imitation Learning".
☆66 Updated 6 years ago
ruizhaogit / EnergyBasedPrioritization
Energy-Based Hindsight Experience Prioritization (CoRL 2018) Oral presentation (7%)
☆33 Updated 6 years ago
vluzko / dac-iclr-reproducibility
ICLR Reproducibility Challenge for Discriminator-Actor-Critic
☆20 Updated 6 years ago
Knoxantropicen / model-based-meta-rl
Self-implemented code for Model-Based Meta-Reinforcement Learning
☆17 Updated 6 years ago
Santara / stochastic_value_gradient
Implementation of "Learning Continuous Control Policies by Stochastic Value Gradients" (https://arxiv.org/abs/1510.09142)
☆26 Updated 3 years ago
joeybose / FloRL
Implicit Normalizing Flows + Reinforcement Learning
☆61 Updated 5 years ago
uber-research / D3G
Estimating Q(s,s') with Deep Deterministic Dynamics Gradients
☆32 Updated 5 years ago
YyzHarry / SV-RL
[ICLR 2020, Oral] Harnessing Structures for Value-Based Planning and Reinforcement Learning
☆34 Updated 5 years ago
hiwonjoon / ICML2019-TREX
☆83 Updated 4 years ago
junjungoal / IMPALA-pytorch
PyTorch IMPALA implementation
☆26 Updated 5 years ago
llan-ml / tesp
Implementation of our paper "Meta Reinforcement Learning with Task Embedding and Shared Policy"
☆34 Updated 5 years ago
NoListen / ERL
Exploration-based reinforcement learning (Montezuma's Revenge).
☆14 Updated 6 years ago
ahq1993 / inverse_rl
Adversarial Imitation Via Variational Inverse Reinforcement Learning
☆95 Updated 5 years ago
dnddnjs / feudal-montezuma
PyTorch implementation of "FeUdal Networks for Hierarchical Reinforcement Learning" for Montezuma's Revenge
☆94 Updated 2 years ago
ermongroup / MetaIRL
Meta-Inverse Reinforcement Learning with Probabilistic Context Variables
☆73 Updated 2 years ago
dannysdeng / dqn-pytorch
PyTorch - Implicit Quantile Networks - Quantile Regression - C51
☆22 Updated 5 years ago
RomainLaroche / SPIBB
Safe Policy Improvement with Baseline Bootstrapping
☆26 Updated 5 years ago
flowersteam / geppg
☆35 Updated 6 years ago
itaicaspi / mgail
Model-Based Generative Adversarial Imitation Learning
☆89 Updated 4 years ago
xkianteb / dril
Disagreement-Regularized Imitation Learning
☆30 Updated 3 years ago
behaviorguidedRL / BGRL
Open source demo for the paper Learning to Score Behaviors for Guided Policy Optimization
☆24 Updated 4 years ago
ruizhaogit / mep
Maximum Entropy-Regularized Multi-Goal Reinforcement Learning (ICML 2019)
☆23 Updated 5 years ago