moaradwan / deep-learning-contextual-bandits
Deep learning models for contextual multi-armed bandit setting
☆13Updated 4 years ago
Alternatives and similar repositories for deep-learning-contextual-bandits
Users who are interested in deep-learning-contextual-bandits are comparing it to the repositories listed below
- Counterfactual Evaluation and Learning for Interactive Systems: Foundations, Implementations, and Recent Advances☆12Updated 2 years ago
- ☆11Updated last year
- Automatically generate simple meta-learning tasks from a very large space☆15Updated last year
- ☆13Updated last year
- [AutoML'22] Bayesian Generational Population-based Training (BG-PBT)☆28Updated 2 years ago
- Generalised UDRL☆37Updated 3 years ago
- Learning from preferences is a common paradigm for fine-tuning language models. Yet, many algorithmic design decisions come into play. Ou…☆29Updated last year
- This is the pytorch implementation of the UAI2023 paper "A Trajectory is Worth Three Sentences: Multimodal Transformer for Offline Reinf…☆11Updated last year
- Vintix: Action Model via In-Context Reinforcement Learning (ICML 2025)☆42Updated last month
- ☆30Updated 3 years ago
- Customizable RecSys Simulator for OpenAI Gym☆26Updated 3 years ago
- Source code for our paper "Pessimistic Decision-Making for Recommender Systems" published at ACM TORS, and RecSys 2021.☆11Updated 2 years ago
- ☆17Updated last year
- Implementation of CASCADE in Learning General World Models in a Handful of Reward-Free Deployments (NeurIPS 22).☆29Updated 2 years ago
- Advantage Leftover Lunch Reinforcement Learning (A-LoL RL): Improving Language Models with Advantage-based Offline Policy Gradients☆26Updated 10 months ago
- Official Implementation of `An Optimisation Framework for Unsupervised Environment Design` from RLC 2025☆12Updated this week
- Scalable Opponent Shaping Experiments in JAX☆24Updated last year
- Offline evaluation of multi-armed bandit algorithms☆23Updated 4 years ago
- This is code to accompany the paper "Accelerating Exploration with Unlabeled Prior Data".☆25Updated last year
- Exploration into the Scaling Value Iteration Networks paper, from Schmidhuber's group☆36Updated 9 months ago
- Drop-in environment replacements that make your RL algorithm train faster.☆21Updated last year
- Official implementation of the NeurIPS 2023 paper "Discovering General Reinforcement Learning Algorithms with Adversarial Environment Des…☆30Updated last year
- Learn online intrinsic rewards from LLM feedback☆41Updated 7 months ago
- Semi-Supervised Offline Reinforcement Learning with Action-Free Trajectories☆42Updated 2 years ago
- ☆31Updated 2 years ago
- This code accompanies the paper "Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration."☆28Updated last week
- Official codebase for "Sampling For Learnability", published at NeurIPS 2024☆16Updated last month
- RecZilla: Metalearning for algorithm selection on Recommender Systems☆24Updated last year
- (ICML2022) Off-Policy Evaluation for Large Action Spaces via Embeddings☆20Updated 2 years ago
- VC-FB and MC-FB algorithms from "Zero-Shot Reinforcement Learning from Low Quality Data" (NeurIPS 2024)☆17Updated 6 months ago