ondrejbiza / bandits
Comparison of bandit algorithms from the Reinforcement Learning bible.
☆17Updated 7 years ago
Alternatives and similar repositories for bandits
Users who are interested in bandits are comparing it to the libraries listed below
- reinforcement learning. policy gradient. PCL☆37Updated 8 years ago
- ☆69Updated 7 years ago
- Deep Reinforcement Learning with Fine Grained Action Repetition☆22Updated 7 years ago
- Simple tools for statistical analyses in RL experiments☆67Updated 7 years ago
- PyTorch implementation of Memory Augmented Self-Play☆52Updated 5 years ago
- A working implementation of the Categorical DQN (Distributional RL).☆96Updated 7 years ago
- Our NIPS 2017: Learning to Run source code☆55Updated 2 years ago
- Adapting the AlphaZero algorithm to remove the need of execution traces to train NPI.☆79Updated 2 years ago
- DQV-Learning: a novel faster synchronous Deep Reinforcement Learning algorithm☆24Updated 2 years ago
- Explore the optimization landscape for direct policy learning reinforcement learning.☆51Updated 6 years ago
- Bandits Environments for the OpenAI Gym☆89Updated 5 years ago
- Combining deep learning and reinforcement learning.☆81Updated 4 years ago
- A Python library for reinforcement learning using Bayesian approaches☆53Updated 10 years ago
- Code to reproduce the results in the "Unsupervised Learning of Goal Spaces for Intrinsically Motivated Exploration"☆21Updated 7 years ago
- ☆28Updated 6 years ago
- ☆24Updated 10 years ago
- [DEPRECATED] Advantage Actor Critic model in PyTorch inspired by OpenAI baselines TensorFlow implementation☆53Updated 5 years ago
- PyTorch implementation of LOLA (https://arxiv.org/abs/1709.04326) using DiCE (https://arxiv.org/abs/1802.05098)☆96Updated 7 years ago
- Implementation of Model-Agnostic Meta-Learning (MAML) in Jax☆191Updated 3 years ago
- Asynchronous Advantage Actor Critic☆20Updated 9 years ago
- A collection of code investigating the use of information theory for abstractions in RL☆16Updated 7 years ago
- Implementation of Neural Episodic Control in Tensorflow☆27Updated 6 years ago
- ☆35Updated 7 years ago
- Replication of Uber Neuroevolution paper☆46Updated 7 years ago
- Implementation of Counterfactual risk minimization☆26Updated 8 years ago
- Code accompanying the OptionGAN paper.☆44Updated 7 years ago
- Models built with TensorFlow☆26Updated 7 years ago
- Skip Context Tree Switching - Reference Implementation☆51Updated 8 years ago
- Full World Models Implementation in Chainer☆168Updated 7 years ago
- Playground for reinforcement learning algorithms implemented in TensorFlow☆16Updated 9 years ago