tansey / tstd0
An experiment with Thompson sampling and TD(0) on a grid world variant
☆17 Updated 11 years ago
Alternatives and similar repositories for tstd0
Users who are interested in tstd0 are comparing it to the repositories listed below.
- Collaborative filtering with the GP-LVM ☆25 Updated 10 years ago
- ☆36 Updated 10 years ago
- Empirical tests of various bandit algorithms. ☆16 Updated 10 years ago
- Epsilon-greedy, softmax and LinUCB contextual bandit implementations [recommender systems] ☆49 Updated 6 years ago
- ☆29 Updated 7 years ago
- Implementation of Counterfactual risk minimization ☆26 Updated 8 years ago
- Sklearn implementation of GBM to predict mu(X) and std(X) on heteroscedastic data ☆26 Updated 9 years ago
- reinforcement learning. policy gradient. PCL ☆37 Updated 8 years ago
- The information sieve for discrete variables. ☆36 Updated 8 years ago
- Starter kit for getting started in the NIPS 2017 Criteo Ad Placement Challenge ☆18 Updated 7 years ago
- ☆58 Updated 9 years ago
- Hybrid Linear UCB bandit learning algorithm L Li(2010) python code ☆56 Updated 9 years ago
- Bayesian Logistic Regression using Laplace approximations to the posterior. ☆47 Updated 8 years ago
- ADMM on Apache Spark ☆31 Updated 9 years ago
- Exponential family embeddings (Poisson or Bernoulli) for discrete data ☆32 Updated 6 years ago
- ☆46 Updated 11 years ago
- Gopalan, P., Ruiz, F. J., Ranganath, R., & Blei, D. M. (2014). Bayesian Nonparametric Poisson Factorization for Recommendation Systems. I… ☆15 Updated 10 years ago
- various simple RNNs trained on synthetic grammars ☆30 Updated 9 years ago
- A Python library for reinforcement learning using Bayesian approaches ☆54 Updated 10 years ago
- ☆28 Updated 6 years ago
- ☆11 Updated 8 years ago
- A board game recommendation engine/model/website. ☆39 Updated 8 years ago
- Ordered Weighted L1 regularization for classification and regression in Python ☆52 Updated 6 years ago
- working example of a contextual multi-armed bandit ☆55 Updated 5 years ago
- simple python interface to SMAC. ☆21 Updated 7 years ago
- Semi-synthetic experiments to test several approaches for off-policy evaluation and optimization of slate recommenders. ☆43 Updated 7 years ago
- ☆12 Updated 7 years ago
- Time Series Prediction: A Non Linear Approach with Neural Networks ☆34 Updated 8 years ago
- Atari gauntlet for RL agents ☆29 Updated 8 years ago
- Asynchronous Advantage Actor Critic ☆20 Updated 8 years ago