maxreciprocate / offline
Offline RL experiments
☆14 Updated 2 years ago
Alternatives and similar repositories for offline:
Users interested in offline compare it to the libraries listed below:
- [AutoML'22] Bayesian Generational Population-based Training (BG-PBT) ☆26 Updated 2 years ago
- Open source code for paper "On the Learning and Learnability of Quasimetrics". ☆32 Updated 2 years ago
- GPT implementation in Flax ☆18 Updated 3 years ago
- PyTorch Package For Quasimetric Learning ☆41 Updated 3 months ago
- Accelerated replay buffers in JAX ☆41 Updated 2 years ago
- ICML 2022: Learning Iterative Reasoning through Energy Minimization ☆46 Updated last year
- VC-FB and MC-FB algorithms from "Zero-Shot Reinforcement Learning from Low Quality Data" (NeurIPS 2024) ☆13 Updated last month
- Docker containers of baseline agents for the Crafter environment ☆28 Updated 3 years ago
- RAD: Reinforcement Learning with Augmented Data (code for state augmentation) ☆11 Updated 3 years ago
- Code repository complementing the ICLR 2021 paper "Unsupervised Object Keypoint Learning using Local Spatial Predictability" (https://arx… ☆9 Updated last month
- Codebase for "Uni[MASK]: Unified Inference in Sequential Decision Problems" ☆54 Updated 7 months ago
- Implements the Messenger environment and EMMA model. ☆23 Updated last year
- Minimal Decision Transformer Implementation written in Jax (Flax). ☆17 Updated 2 years ago
- Generalised UDRL ☆37 Updated 2 years ago
- Official codebase for Exact Energy-Guided Diffusion Sampling via Contrastive Energy Prediction (ICML 2023) ☆45 Updated last year
- Official code for "Accelerating Feedforward Computation via Parallel Nonlinear Equation Solving", ICML 2021 ☆27 Updated 3 years ago
- ☆15 Updated last year
- This is code for most of the experiments in the paper Understanding the Effects of RLHF on LLM Generalisation and Diversity ☆40 Updated last year
- [NeurIPS'20] Code for the Paper Compositional Visual Generation and Inference with Energy Based Models ☆44 Updated last year
- Advantage Leftover Lunch Reinforcement Learning (A-LoL RL): Improving Language Models with Advantage-based Offline Policy Gradients ☆26 Updated 5 months ago
- Adaptable Agent Populations via a Generative Model of Policies ☆13 Updated 3 years ago
- ☆18 Updated last year
- Guide Your Agent with Adaptive Multimodal Rewards (NeurIPS 2023 Accepted) ☆33 Updated last year
- ☆29 Updated 3 years ago
- Learning Robust Dynamics Through Variational Sparse Gating ☆21 Updated 2 years ago
- ☆30 Updated 2 months ago
- ☆15 Updated 2 years ago
- TPU pod commander is a package for managing and launching jobs on Google Cloud TPU pods. ☆20 Updated 7 months ago
- Building blocks for productive research ☆50 Updated 2 weeks ago
- ☆14 Updated last year