5vision / uct_atari
UCT tree search + supervised learning for Atari games
☆12 Updated 8 years ago
Alternatives and similar repositories for uct_atari:
Users who are interested in uct_atari are comparing it to the libraries listed below.
- Explore the optimization landscape for direct policy learning reinforcement learning. ☆50 Updated 6 years ago
- ☆35 Updated 6 years ago
- DQV-Learning: a novel faster synchronous Deep Reinforcement Learning algorithm ☆25 Updated 2 years ago
- Tutorial on continuous control at Reinforcement Learning Summer School 2017. ☆34 Updated 7 years ago
- Algorithmic Framework for Model-based Deep Reinforcement Learning with Theoretical Guarantees ☆93 Updated 5 years ago
- Simple tools for statistical analyses in RL experiments ☆66 Updated 6 years ago
- This is my implementation of the Optimality Tightening ☆37 Updated 7 years ago
- Proximal Policy Optimization with Stein Control Variates: ☆33 Updated 7 years ago
- ☆34 Updated 4 years ago
- Code to reproduce Supervised Policy Update (ICLR 2019) ☆17 Updated 2 years ago
- A working implementation of the Categorical DQN (Distributional RL). ☆96 Updated 6 years ago
- reinforcement learning. policy gradient. PCL ☆37 Updated 7 years ago
- ☆24 Updated 9 years ago
- ☆44 Updated 6 years ago
- Code accompanying the OptionGAN paper. ☆43 Updated 6 years ago
- Attempt at reinforcement learning with curiosity for Sonic the Hedgehog games. Number 149 on OpenAI retro contest leaderboard, but more w… ☆32 Updated 6 years ago
- TensorFlow A2C to solve Acrobot, with synchronized parallel environments ☆35 Updated 6 years ago
- Reason8.ai PyTorch solution for NIPS RL 2017 challenge ☆84 Updated 5 years ago
- Models built with TensorFlow ☆25 Updated 6 years ago
- Robust policy search algorithms which train on model ensembles ☆28 Updated 8 years ago
- ☆19 Updated 8 years ago
- Code to reproduce the results in the "Unsupervised Learning of Goal Spaces for Intrinsically Motivated Exploration" ☆21 Updated 7 years ago
- Python implementation of tabular asynchronous actor critic ☆11 Updated 8 years ago
- Reinforcement learning benchmarking. ☆39 Updated 6 years ago
- TensorFlow implementation of Value Iteration Networks (VIN): Clean, Simple and Modular ☆52 Updated 7 years ago
- PyTorch implementation of Memory Augmented Self-Play ☆50 Updated 4 years ago
- ☆28 Updated 5 years ago
- Comparison of bandit algorithms from the Reinforcement Learning bible. ☆17 Updated 6 years ago
- ☆17 Updated 7 years ago
- ☆19 Updated 5 years ago