shuaili8 / OnlineLearningToRank
Bandit algorithms for online learning to rank
☆17 Updated 5 years ago
Alternatives and similar repositories for OnlineLearningToRank:
Users who are interested in OnlineLearningToRank are comparing it to the repositories listed below.
- ☆52 Updated 5 years ago
- Off-policy Learning in Two-stage Recommender Systems. https://dl.acm.org/doi/pdf/10.1145/3366423.3380130 ☆28 Updated 4 years ago
- Implementing LinUCB and HybridLinUCB in Python. ☆47 Updated 6 years ago
- Stream Data based News Recommendation - Contextual Bandit Approach ☆48 Updated 7 years ago
- ☆81 Updated 6 years ago
- ☆35 Updated 6 years ago
- ☆11 Updated 5 years ago
- Mengting Wan, Julian McAuley, "Item Recommendation on Monotonic Behavior Chains", in Proc. of 2018 ACM Conference on Recommender Systems … ☆55 Updated 2 months ago
- A pytorch implementation of A Model-Based Reinforcement Learning with Adversarial Training for Online Recommendation. ☆39 Updated 5 years ago
- Source code for our paper "Joint Policy-Value Learning for Recommendation" published at KDD 2020. ☆22 Updated last year
- ☆16 Updated 4 years ago
- SCoRe is a sequential recommendation model with dual side neighbor-based collaborative filtering. Implementation of our WSDM 2020 paper. ☆17 Updated 5 years ago
- A set of RL experiments. Currently including: (1) the MDP rank experiment, based on policy gradient algorithm ☆27 Updated 3 years ago
- Pytorch implementation of λOpt: Learn to Regularize Recommender Models in Finer Levels, KDD 2019 ☆53 Updated 4 years ago
- A TensorFlow implementation of SOFA, the Simulator for OFfline LeArning and evaluation. ☆20 Updated 4 years ago
- Code for paper "On Sampling Strategies for Neural Network-based Collaborative Filtering" ☆39 Updated 7 years ago
- Ranking-Critical Training for Collaborative Filtering ☆36 Updated 8 months ago
- Determinantal point processes for basket recommendations ☆15 Updated 6 years ago
- A python implementation of Dueling Bandit Gradient Descent (DBGD) ☆22 Updated 6 years ago
- Code for the experiments of Matrix Factorization Bandit ☆24 Updated 6 years ago
- Lifelong sequential modeling for user response prediction. A comprehensive evaluation framework for our SIGIR 2019 paper. ☆102 Updated 4 years ago
- Code for Policy Learning for Fairness in Ranking paper at NeurIPS 2019 ☆20 Updated 2 years ago
- Contextual bandit algorithm called LinUCB / Linear Upper Confidence Bounds as proposed by Li, Langford and Schapire ☆29 Updated 2 years ago
- ☆16 Updated 7 years ago
- This is an implementation of the Dual Learning Algorithm with multi-layer feed-forward neural network for online unbiased learning to ran… ☆89 Updated 2 years ago
- ☆17 Updated 7 years ago
- ☆22 Updated 2 years ago
- TEM: Tree-enhanced Embedding Model for Explainable Recommendation, WWW2018 ☆76 Updated 5 years ago
- ☆63 Updated 4 years ago
- Offline evaluation of multi-armed bandit algorithms ☆22 Updated 4 years ago