PyTorch Implementation of the Maximum a Posteriori Policy Optimisation
☆80 Nov 19, 2022 Updated 3 years ago
Alternatives and similar repositories for mpo
Users who are interested in mpo compare it to the libraries listed below.
- Pytorch implementation of "Maximum a Posteriori Policy Optimization" with Retrace for Discrete gym environments ☆29 Sep 10, 2020 Updated 5 years ago
- Simple implementation of V-MPO proposed in https://arxiv.org/abs/1909.12238 ☆48 Nov 10, 2020 Updated 5 years ago
- Minimal Decision Transformer Implementation written in Jax (Flax). ☆17 Aug 8, 2022 Updated 3 years ago
- ☆23 Jun 8, 2021 Updated 4 years ago
- A PyTorch Implementation of PlaNet: A Deep Planning Network for Reinforcement Learning ☆12 Aug 31, 2020 Updated 5 years ago
- ☆13 Apr 25, 2024 Updated last year
- ☆59 Sep 22, 2022 Updated 3 years ago
- Deep Reinforcement Learning by using an on-policy adaptation of Maximum a Posteriori Policy Optimization (MPO) ☆16 Oct 23, 2021 Updated 4 years ago
- Reinforcement learning library for PyTorch. ☆11 Jun 15, 2018 Updated 7 years ago
- V-MPO torch version with DMLab30 and GTrXL ☆13 Mar 1, 2021 Updated 5 years ago
- [NeurIPS 2022] ASPiRe: Adaptive Skill Priors for Reinforcement Learning ☆13 Oct 19, 2022 Updated 3 years ago
- Muesli RL algorithm implementation (PyTorch) (LunarLander-v2) ☆19 Mar 18, 2024 Updated last year
- A simple, continuous-control environment for OpenAI Gym ☆23 Jan 1, 2023 Updated 3 years ago
- The implementation of Discriminator Soft Actor Critic ☆15 Jan 25, 2020 Updated 6 years ago
- A number of agents (PPO, MuZero) with a Perceiver-based NN architecture that can be trained to achieve goals in nethack/minihack environm… ☆43 Sep 19, 2022 Updated 3 years ago
- Source files to replicate experiments in my ICLR 2022 paper. ☆71 Jul 17, 2025 Updated 7 months ago
- Open source code combining implementations of Upside Down Reinforcement Learning and Reward Conditioned Policies ☆19 Mar 10, 2021 Updated 4 years ago
- ☆18 Nov 4, 2021 Updated 4 years ago
- Official codebase for Generating Diverse Cooperative Agents by Learning Incompatible Policies (notable-top-25% @ ICLR 2023) ☆19 May 10, 2024 Updated last year
- Code to accompany the paper "Mismatched No More: Joint Model-Policy Optimization for Model-Based RL" ☆21 Oct 6, 2021 Updated 4 years ago
- Code for "Planning Goals for Exploration", ICLR2023 Spotlight. An unsupervised RL agent for hard exploration tasks. ☆82 May 13, 2024 Updated last year
- Model-based Policy Gradients ☆32 Mar 12, 2020 Updated 5 years ago
- ☆16 May 5, 2022 Updated 3 years ago
- Episodic Control ☆22 Sep 20, 2022 Updated 3 years ago
- Implementation of Proximal Policy Optimization in Jax+Flax ☆21 May 18, 2023 Updated 2 years ago
- Change-Based Exploration Transfer ☆35 Apr 24, 2022 Updated 3 years ago
- ☆93 Jan 21, 2026 Updated last month
- Distrax, but in equinox. Lightweight JAX library of probability distributions and bijectors. ☆39 Jan 16, 2026 Updated last month
- Code for "SimbaV2: Hyperspherical Normalization for Scalable Deep Reinforcement Learning" ☆91 Nov 4, 2025 Updated 4 months ago
- This repository is the official implementation of ZSC-Eval: An Evaluation Toolkit and Benchmark for Multi-agent Zero-shot Coordination. P… ☆54 Nov 22, 2025 Updated 3 months ago
- Model Agnostic Meta Learning (MAML) implemented in Flax, the neural network library for JAX. ☆21 Sep 18, 2020 Updated 5 years ago
- Simple single file implementations of Reinforcement Learning algorithms in Julia ☆23 Feb 15, 2025 Updated last year
- Bayesian active RL (BARL) and trajectory information planning (TIP) ☆26 Oct 11, 2022 Updated 3 years ago
- ☆23 Aug 19, 2022 Updated 3 years ago
- ☆10 Jun 27, 2024 Updated last year
- Attentional Mechanism incorporated in Asynchronous Advantage Actor Critic a3c/a2c deep mind ☆10 Jan 9, 2018 Updated 8 years ago
- PyTorch implementation for all methods and environments in the paper "MIMEx: Intrinsic Rewards from Masked Input Modeling" ☆16 May 17, 2023 Updated 2 years ago
- OpenAI Gym wrapper for the DeepMind Control Suite ☆227 May 19, 2024 Updated last year
- Evaluation of TD-MPC2. ☆21 Jan 21, 2024 Updated 2 years ago