henrycharlesworth / multi_action_head_PPO
PPO with multi-head/autoregressive action outputs
☆39 · Updated 4 years ago
Alternatives and similar repositories for multi_action_head_PPO:
Users who are interested in multi_action_head_PPO are comparing it to the repositories listed below:
- Collection of OpenAI parametrized action-space environments. ☆64 · Updated 2 weeks ago
- Source Code for A Closer Look at Invalid Action Masking in Policy Gradient Algorithms ☆152 · Updated last year
- A Modular Library for Off-Policy Reinforcement Learning with a focus on SafeRL and distributed computing ☆133 · Updated 8 months ago
- ☆48 · Updated 3 years ago
- Distributional Soft Actor Critic ☆52 · Updated 4 years ago
- PyTorch implementation of the discrete Soft-Actor-Critic algorithm. ☆50 · Updated 3 years ago
- PyTorch implementation of Constrained Policy Optimization ☆53 · Updated 3 years ago
- Modified versions of the SAC algorithm from spinningup for discrete action spaces and image observations. ☆94 · Updated 4 years ago
- DEPRECATED - please visit https://github.com/vwxyzjn/ppo-implementation-details ☆46 · Updated 2 years ago
- There will be updates later ☆84 · Updated 5 years ago
- Code for Weighted QMIX ☆134 · Updated 4 years ago
- ☆74 · Updated 10 months ago
- Hierarchical Cooperative Multi-Agent Reinforcement Learning with Skill Discovery ☆99 · Updated 2 years ago
- Implementation of HIRO (Data-Efficient Hierarchical Reinforcement Learning) ☆106 · Updated 3 years ago
- PyTorch Implementation of FeUdal Networks for Hierarchical Reinforcement Learning (FuNs), Vezhnevets et al. 2017. ☆39 · Updated 4 years ago
- Submission for MAVEN: Multi-Agent Variational Exploration ☆57 · Updated 2 years ago
- Single-file pytorch implementation of hybrid-SAC ☆55 · Updated 3 years ago
- DecentralizedLearning ☆24 · Updated 2 years ago
- Pytorch implementation of "Safe Exploration in Continuous Action Spaces" [Dalal et al.] ☆70 · Updated 5 years ago
- Learning Individual Intrinsic Reward in MARL ☆62 · Updated 2 years ago
- Codes accompanying the paper "DOP: Off-Policy Multi-Agent Decomposed Policy Gradients" (ICLR 2021, https://arxiv.org/abs/2007.12322) ☆52 · Updated 2 years ago
- ☆43 · Updated 4 years ago
- ☆93 · Updated 4 years ago
- ☆84 · Updated 3 years ago
- ☆40 · Updated 3 years ago
- ☆27 · Updated 4 years ago
- Value-Decomposition Multi-Agent Actor-Critics ☆40 · Updated 2 years ago
- Random network distillation on Montezuma's Revenge and Super Mario Bros. ☆48 · Updated 2 years ago
- This is a framework for the research on multi-agent reinforcement learning and the implementation of the experiments in the paper titled … ☆117 · Updated 4 months ago
- Deep Transformer Q-Networks for Partially Observable Reinforcement Learning ☆160 · Updated 8 months ago