Bick95 / PPO
Comprehensive Implementation of Proximal Policy Optimization
☆10Updated 3 years ago
Alternatives and similar repositories for PPO
Users who are interested in PPO are comparing it to the repositories listed below
- Code for the NeurIPS 2021 paper "Deep Bandits Show-Off: Simple and Efficient Exploration with Deep Networks"☆14Updated 2 years ago
- A Jax/Stax implementation of the general meta learning paper: Oh, J., Hessel, M., Czarnecki, W.M., Xu, Z., van Hasselt, H.P., Singh, S. a…☆21Updated 4 years ago
- QuaRL is an open-source framework for systematically studying the effect of applying quantization to reinforcement learning algorithms.☆69Updated 2 years ago
- An implementation of MuZero in JAX.☆56Updated 2 years ago
- JAX code for the paper "Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation"☆43Updated 3 years ago
- This code implements Prioritized Level Replay, a method for sampling training levels for reinforcement learning agents that exploits the …☆86Updated 3 years ago
- ☆43Updated 8 years ago
- PyTorch implementation of "FeUdal Networks for Hierarchical Reinforcement Learning" for Montezuma's Revenge☆94Updated 2 years ago
- Code implementing the CORE-RL algorithm with DDPG, PPO, and TRPO. See the paper "Control Regularization for Reduced Variance Reinforcemen…☆32Updated 4 years ago
- This repository contains code for the method and experiments of the paper "Learning with AMIGo: Adversarially Motivated Intrinsic Goals".☆61Updated last year
- Reinforcement Learning Assembly☆92Updated 3 years ago
- A novel parallel UCT algorithm with linear speedup and negligible performance loss.☆118Updated 4 years ago
- A3C style Option-Critic with deliberation cost☆39Updated 7 years ago
- Representation Learning in RL☆16Updated 2 years ago
- ☆65Updated last year
- rlcourse-march-17-hugobb created by GitHub Classroom☆16Updated 10 months ago
- Open source demo for the paper Learning to Score Behaviors for Guided Policy Optimization☆24Updated 4 years ago
- Code to reproduce the experiments in The Mirage of Action-Dependent Baselines in Reinforcement Learning.☆17Updated 6 years ago
- Implementation of Truncated Quantile Critics method for continuous reinforcement learning.☆25Updated 2 years ago
- ☆24Updated 2 years ago
- On-policy optimization baselines for deep reinforcement learning☆30Updated 5 years ago
- Mirror Descent Policy Optimization☆38Updated 4 years ago
- CleanRL's implementation of DeepMind's Podracer Sebulba Architecture for Distributed DRL☆112Updated 8 months ago
- Implementation of the Model-Based Meta-Policy-Optimization (MB-MPO) algorithm☆44Updated 6 years ago
- ☆111Updated 5 years ago
- Soft Actor-Critic☆145Updated 7 years ago
- PyTorch implementation of LOLA (https://arxiv.org/abs/1709.04326) using DiCE (https://arxiv.org/abs/1802.05098)☆95Updated 6 years ago
- ☆30Updated last year
- General Modules for JAX☆65Updated last month
- Implicit Normalizing Flows + Reinforcement Learning☆61Updated 5 years ago