yao62995 / trpo

Replicating the paper “Trust Region Policy Optimization”

☆8

Alternatives and similar repositories for trpo

Users that are interested in trpo are comparing it to the libraries listed below

andrewliao11 / pytorch-a3c-mujoco
Implement A3C for Mujoco gym envs
☆72 Updated 7 years ago
Feryal / a3c-mujoco
☆28 Updated 7 years ago
ethanluoyc / e2c-pytorch
E2C implementation in PyTorch
☆43 Updated 8 years ago
bstadie / third_person_im
third person imitation learning. Archival only.
☆76 Updated 5 years ago
tmoer / multimodal_varinf
Code for paper "Learning Multimodal Transition Dynamics for Model-Based Reinforcement Learning".
☆35 Updated 7 years ago
rddy / isql
Inferring beliefs about dynamics from behavior
☆29 Updated 7 years ago
ShibiHe / Q-Optimality-Tightening
This is my implementation of the Optimality Tightening
☆37 Updated 8 years ago
Breakend / OptionGAN
Code accompanying the OptionGAN paper.
☆44 Updated 6 years ago
kimhc6028 / pytorch-noreward-rl
pytorch implementation of Curiosity-driven Exploration by Self-supervised Prediction
☆80 Updated 6 years ago
ilyasu123 / trpo
☆19 Updated 9 years ago
BerkeleyAutomation / DART
☆49 Updated 5 years ago
KyriacosShiarli / taco
☆25 Updated 6 years ago
jjkke88 / trpo
Trust region policy optimization based on Gym and TensorFlow; can run in distributed mode
☆15 Updated 8 years ago
DanielTakeshi / imitation
☆13 Updated 8 years ago
ericjang / e2c
TensorFlow implementation of: Embed to Control: A Locally Linear Latent Dynamics Model for Control from Raw Images
☆65 Updated 9 years ago
DartML / PPO-Stein-Control-Variate
Proximal Policy Optimization with Stein Control Variates
☆33 Updated 7 years ago
itaicaspi / mgail
Model-Based Generative Adversarial Imitation Learning
☆89 Updated 4 years ago
EndingCredits / Neural-Episodic-Control
Implementation of Deepmind's Neural Episodic Control
☆58 Updated 7 years ago
rlbayes / rllabplusplus
☆159 Updated 7 years ago
martinseilair / dm_control2gym
OpenAI Gym Wrapper for DeepMind Control Suite
☆71 Updated 3 years ago
aravindsrinivas / upn
☆33 Updated 7 years ago
pathak22 / exploration-by-disagreement
[ICML 2019] TensorFlow Code for Self-Supervised Exploration via Disagreement
☆125 Updated 6 years ago
Feryal / craft-env
☆44 Updated 6 years ago
quanvuong / Supervised_Policy_Update
Code to reproduce Supervised Policy Update (ICLR 2019)
☆17 Updated 2 years ago
junhyukoh / value-prediction-network
NIPS 2017 Value Prediction Network
☆166 Updated 7 years ago
kindredresearch / arp
Autoregressive policies for continuous control reinforcement learning
☆32 Updated 6 years ago
strin / curriculum-deep-RL
Design good curriculums for deep reinforcement learning
☆14 Updated 9 years ago
nnaisense / MAX
Code for reproducing experiments in Model-Based Active Exploration, ICML 2019
☆79 Updated 5 years ago
Nat-D / FeatureControlHRL
Feature Control as Intrinsic Motivation for Hierarchical Reinforcement Learning
☆80 Updated 7 years ago
Shallow-Updates-for-Deep-RL / Shallow_Updates_for_Deep_RL
Official implementation for the paper: "Shallow Updates for Deep Reinforcement Learning"
☆18 Updated 7 years ago