annieyan / Bandits-using-UCB-algorithm
Thompson Sampling for Bandits using UCB policy
☆10Updated 8 years ago
Alternatives and similar repositories for Bandits-using-UCB-algorithm
Users that are interested in Bandits-using-UCB-algorithm are comparing it to the libraries listed below
- Deconfounding Reinforcement Learning in Observational Settings☆51Updated 6 years ago
- Contextual Bandit Algorithms (+Bandit Algorithms)☆22Updated 6 years ago
- Contextual bandit in python☆112Updated 4 years ago
- Thompson Sampling Tutorial☆56Updated 7 years ago
- ☆27Updated 6 years ago
- Paper list in the area of reinforcement learning for recommendation systems☆25Updated 5 years ago
- In this notebook several classes of multi-armed bandits are implemented. This includes epsilon greedy, UCB, Linear UCB (Contextual bandit…☆90Updated 5 years ago
- Implementation of the algorithm in Python 3, TensorFlow and OpenAI Gym☆178Updated 7 years ago
- Upper Confidence Tree Planner for ATARI games☆19Updated 9 years ago
- Study NeuralUCB and regret analysis for contextual bandit with neural decision☆99Updated 4 years ago
- Code associated with the NeurIPS19 paper "Weighted Linear Bandits in Non-Stationary Environments"☆17Updated 6 years ago
- The submission template for the Learning to Dispatch and Reposition Competition @ KDD2020.☆93Updated 4 years ago
- Non stationary bandit for experiments with Reinforcement Learning☆33Updated 8 years ago
- ☆33Updated 3 years ago
- Library of contextual bandits algorithms☆338Updated last year
- Play with the solutions to the multi-armed-bandit problem.☆416Updated last year
- Simple implementation of GP-UCB algorithm.☆54Updated 9 years ago
- Dynamic Pricing BwK Problem and Reinforcement Learning☆31Updated 7 years ago
- Code for the paper "Skynet: A Top Deep RL Agent in the Inaugural Pommerman Team Competition"☆37Updated 6 years ago
- RainBow, Tensorflow☆49Updated 7 years ago
- Maximum Causal Entropy Inverse Reinforcement Learning☆48Updated 7 years ago
- Implementing the Learning with Opponent Learning Awareness paper (https://blog.openai.com/learning-to-model-other-minds/)☆19Updated 7 years ago
- OPE Tools based on Empirical Study of Off Policy Policy Estimation paper.☆62Updated 3 years ago
- Implementation of DDPG (Modified from the work of Patrick Emami) - Tensorflow (no TFLearn dependency), Ornstein Uhlenbeck noise function,…☆64Updated 8 years ago
- Efficient Exploration through Bayesian Deep Q-Networks☆37Updated 7 years ago
- Implementation of proximal policy optimization (PPO) with TensorFlow☆35Updated 7 years ago
- A Multi-agent Learning Framework☆62Updated 4 years ago
- ☆368Updated 5 years ago
- Implementation of Optimal Auctions through Deep Learning☆134Updated 6 years ago
- Materials for the Practical Sessions of the Reinforcement Learning Summer School 2019: Bandits, RL & Deep RL (PyTorch).☆90Updated 6 years ago