shuaili8 / Bandit_book_solutions

☆13

Alternatives and similar repositories for Bandit_book_solutions

Users that are interested in Bandit_book_solutions are comparing it to the libraries listed below

ArronDZhang / ROLeR
The official code for ROLeR from CIKM 2024
☆7 Updated 7 months ago
ZIYU-DEEP / Awesome-Papers-on-Combinatorial-Semi-Bandit-Problems
A curated list of papers on combinatorial multi-armed bandit problems.
☆17 Updated 4 years ago
Xiaoyinggit / ConUCB
☆11 Updated 4 years ago
henryslzhao / RL4Recsys
Paper list in the area of reinforcement learning for recommendation systems
☆24 Updated 4 years ago
zhaoyu-li / NSNet
[NeurIPS 2022] "NSNet: A General Neural Probabilistic Framework for Satisfiability Problems"
☆18 Updated 2 years ago
tao-shen / EdgeRec
mcc_demo
☆10 Updated 3 years ago
uclaml / NeuralUCB
☆35 Updated 5 years ago
sjtuhuoda / LearningforAuctionPapers
☆17 Updated 3 years ago
sauxpa / neural_exploration
Study NeuralUCB and regret analysis for contextual bandit with neural decision
☆95 Updated 3 years ago
zoulixin93 / pseudo_dyna_q
☆14 Updated 5 years ago
kantneel / causal-metarl
WIP implementation of https://arxiv.org/pdf/1901.08162.pdf
☆9 Updated 5 years ago
chenhaokun / TPGR
Python implementation of TPGR
☆39 Updated 6 years ago
LiuShuai26 / Distributed-RL
Distributed DRL by Ray and TensorFlow Tutorial.
☆10 Updated 5 years ago
BetsyHJ / SOFA
A TensorFlow implementation of SOFA, the Simulator for OFfline LeArning and evaluation.
☆21 Updated 4 years ago
lyeskhalil / CORL
☆25 Updated 3 years ago
clvoloshin / constrained_batch_policy_learning
☆27 Updated 5 years ago
tuomaso / radial_rl
Code used in our paper "Robust Deep Reinforcement Learning through Adversarial Loss"
☆33 Updated last year
dgliu / SIGIR20_KDCRec
Experiment code for the SIGIR '20 paper "A General Knowledge Distillation Framework for Counterfactual Recommendation via Uniform Data"
☆32 Updated 5 years ago
BetsyHJ / RL4Rec
A toolkit of Reinforcement Learning based Recommendation (RL4Rec)
☆23 Updated 3 years ago
PKU-RL / Literature
☆106 Updated 4 years ago
danni9594 / ASMG
☆13 Updated 3 years ago
usaito / icml2022-mips
(ICML2022) Off-Policy Evaluation for Large Action Spaces via Embeddings
☆20 Updated 2 years ago
eyounx / PRR
Meta-Reinforcement Learning with Policy Residual Representation
☆11 Updated 5 years ago
saisrivatsan / deep-opt-auctions
Implementation of Optimal Auctions through Deep Learning
☆128 Updated 5 years ago
jparkerholder / PB2
Code for the Population-Based Bandits Algorithm, presented at NeurIPS 2020.
☆20 Updated 4 years ago
antoine-hochart / bandit_algo_evaluation
Offline evaluation of multi-armed bandit algorithms
☆23 Updated 4 years ago
chaovven / maab
Code for "A Cooperative-Competitive Multi-Agent Framework for Auto-bidding in Online Advertising" WSDM 2022
☆23 Updated 3 years ago
Xi-L / STCH
☆12 Updated last month
ustcljb / topK-off-policy-correction-REINFORCE
☆18 Updated 4 years ago
KuNyaa / berkeleydeeprlcourse-homework-pytorch-solution
Solutions for CS294-112 Fall 2018 assignments in PyTorch
☆20 Updated 6 years ago