annieyan / Bandits-using-UCB-algorithm
Thompson Sampling for Bandits using UCB policy
☆10Updated 7 years ago
Alternatives and similar repositories for Bandits-using-UCB-algorithm
Users that are interested in Bandits-using-UCB-algorithm are comparing it to the libraries listed below
- Contextual Bandit Algorithms (+Bandit Algorithms)☆22Updated 5 years ago
- Contextual bandit in python☆114Updated 3 years ago
- Implementations of basic concepts dealt with under the Reinforcement Learning umbrella. This project is a collection of assignments in CS747: F…☆17Updated 7 years ago
- Dynamic Pricing BwK Problem and Reinforcement Learning☆31Updated 6 years ago
- My solutions to Berkeley's CS294 (Deep Reinforcement Learning) Homework☆36Updated 7 years ago
- Non stationary bandit for experiments with Reinforcement Learning☆34Updated 8 years ago
- Duel_DDQN (Dueling Network Architectures + Double DQN) using Keras☆31Updated 9 years ago
- FEN Code☆38Updated 5 years ago
- Deep Q Network implemented in TensorFlow☆25Updated 7 years ago
- RainBow, Tensorflow☆49Updated 7 years ago
- Python code for the post "Adversarial Bandits and the Exp3 Algorithm"☆51Updated 5 years ago
- Code associated with the NeurIPS19 paper "Weighted Linear Bandits in Non-Stationary Environments"☆17Updated 5 years ago
- Policy gradient reinforcement learning algorithm with importance sampling☆32Updated 7 years ago
- ☆27Updated 5 years ago
- Direct Gibbs sampling for DPMM using python.☆16Updated 8 years ago
- research and implementations of Deep RL agents and their applications☆51Updated 3 weeks ago
- reproduce some RL or Multi-Agent models☆35Updated 6 years ago
- Thompson Sampling Tutorial☆53Updated 6 years ago
- Code to reproduce Supervised Policy Update (ICLR 2019)☆17Updated 2 years ago
- Upper Confidence Tree Planner for ATARI games☆19Updated 9 years ago
- TensorFlow & Keras implementation of DQN with HER (Hindsight Experience Replay)☆40Updated 4 years ago
- Code for the paper "Skynet: A Top Deep RL Agent in the Inaugural Pommerman Team Competition"☆37Updated 6 years ago
- Code for Expert Supervised Reinforcement Learning☆10Updated 4 years ago
- ☆16Updated 6 years ago
- working example of a contextual multi-armed bandit☆55Updated 5 years ago
- Tensorflow + OpenAI Gym implementation of Deep Q-Network (DQN), Double DQN (DDQN), Dueling Network and Deep Deterministic Policy Gradient…☆77Updated 8 years ago
- ☆27Updated 6 years ago
- ☆43Updated last month
- ☆10Updated 8 years ago
- ☆38Updated 3 years ago