facebookresearch / xbanditsrl
Contextual Bandit Spectral Representation Learner
☆10Updated 2 years ago
Related projects
Alternatives and complementary repositories for xbanditsrl
- Automatically generate simple meta-learning tasks from a very large space☆15Updated last year
- A2C is a special case of PPO!☆19Updated 2 years ago
- ☆14Updated 2 months ago
- Causal Analysis of Agent Behavior for AI Safety☆17Updated last year
- Generalised UDRL☆37Updated 2 years ago
- Evaluating different engineering tricks that make RL work☆15Updated 3 years ago
- Deep learning models for contextual multi-armed bandit setting☆12Updated 3 years ago
- ☆17Updated 2 years ago
- Implementation of Advantage Actor Critic (A2C) and Proximal Policy Optimization Algorithm (PPO) using the advantages of TensorFlow 2.x.☆9Updated 4 years ago
- [AutoML'22] Bayesian Generational Population-based Training (BG-PBT)☆26Updated 2 years ago
- Semi-Markov Afterstate Actor-Critic (SMAAC) with Maze☆10Updated 3 years ago
- Semi-Supervised Offline Reinforcement Learning with Action-Free Trajectories☆41Updated last year
- Pytorch implementation of the Deep Deterministic Policy Gradients for Continuous Control☆26Updated last year
- Official implementation of the NeurIPS 2023 paper "Discovering General Reinforcement Learning Algorithms with Adversarial Environment Des…☆22Updated 4 months ago
- Clockwork VAEs in JAX/Flax☆32Updated 3 years ago
- ☆9Updated 2 years ago
- ☆15Updated 9 months ago
- SkillHack: A Benchmark for Skill Transfer in Open-Ended Reinforcement Learning☆13Updated 2 years ago
- This is code to accompany the paper "Accelerating Exploration with Unlabeled Prior Data".☆20Updated 11 months ago
- ☆28Updated 2 years ago
- Implementation of CASCADE in Learning General World Models in a Handful of Reward-Free Deployments (NeurIPS 22).☆29Updated 2 years ago
- Gym wrapper for pysc2☆10Updated 2 years ago
- Jax implementation of Proximal Policy Optimization (PPO) specifically tuned for Procgen, with benchmarked results and saved model weights…☆49Updated 2 years ago
- TaskMet: Task-driven Metric Learning for Model Learning☆18Updated 9 months ago
- Building blocks for productive research☆45Updated last week
- Reinforcement Learning Assembly☆92Updated 3 years ago
- ☆41Updated 2 months ago
- Contextual Bandits Action Elimination DQN☆19Updated 6 years ago
- Understanding RL vision Distill article☆23Updated last year