lucidrains / ppo
An implementation of PPO in Pytorch
☆58 Updated last week
Alternatives and similar repositories for ppo:
Users who are interested in ppo are comparing it to the libraries listed below:
- CleanRL's implementation of DeepMind's Podracer Sebulba Architecture for Distributed DRL ☆109 Updated 5 months ago
- Proximal Policy Optimization (Continuous Version) in PyTorch. ☆27 Updated 3 years ago
- Reproduction of Dreamerv1 and v2 in pytorch for deepmind control suite ☆35 Updated 2 years ago
- Challenging Memory-based Deep Reinforcement Learning Agents ☆93 Updated 3 months ago
- ☆20 Updated 8 months ago
- Extreme Q-Learning: Max Entropy RL without Entropy ☆84 Updated 2 years ago
- Pytorch Implementation of MuZero Unplugged for gym environment. This algorithm is capable of supporting a wide range of action and observ… ☆27 Updated 2 years ago
- PyTorch Implementation of Implicit Quantile Networks (IQN) for Distributional Reinforcement Learning with additional extensions like PER,… ☆85 Updated last year
- Deep Reinforcement Learning Framework done with PyTorch ☆32 Updated this week
- Accelerated replay buffers in JAX ☆41 Updated 2 years ago
- ☆73 Updated 3 months ago
- JAX implementation of deep RL agents with resets from the paper "The Primacy Bias in Deep Reinforcement Learning" ☆102 Updated 2 years ago
- The Controllable Agent project trains RL Agents able to optimize any reward function specified in real time, without any further learning… ☆61 Updated last year
- Deep Reinforcement Learning by using Proximal Policy Optimization and Random Network Distillation in Tensorflow 2 and Pytorch with some e… ☆50 Updated 4 years ago
- Implementation of Soft Actor Critic and some of its improvements in Pytorch ☆52 Updated this week
- Benchmarking RL for POMDPs in Pure JAX [Code for "Structured State Space Models for In-Context Reinforcement Learning" (NeurIPS 2023)] ☆94 Updated last year
- Code of the paper: Debiasing Meta-Gradient Reinforcement Learning by Learning the Outer Value Function ☆13 Updated 2 years ago
- Code accompanying the paper "TiZero: Mastering Multi-Agent Football with Curriculum Learning and Self-Play" (AAMAS 2023), a football game agent ☆51 Updated last year
- Synthetic Experience Replay ☆86 Updated 8 months ago
- Learning to Modulate pre-trained Models in RL (Decision Transformer, LoRA, Fine-tuning) ☆53 Updated 4 months ago
- Clean baseline implementation of PPO using an episodic TransformerXL memory ☆165 Updated 8 months ago
- Plug-and-play hydra sweepers for the EA-based multifidelity method DEHB and several population-based training variations, all proven to e… ☆73 Updated last year
- Implementation of Trajectory Transformer with attention caching and batched beam search ☆109 Updated last year
- Code for TRANSDREAMER: REINFORCEMENT LEARNING WITH TRANSFORMER WORLD MODELS ☆25 Updated last year
- Generalized Decision Transformer for Offline Hindsight Information Matching (ICLR2022) ☆67 Updated 2 years ago
- Single-file SAC-N implementation on jax with flax and equinox. 10x faster than pytorch ☆47 Updated last year
- Jax implementation of Proximal Policy Optimization (PPO) specifically tuned for Procgen, with benchmarked results and saved model weights… ☆53 Updated 2 years ago
- ☆70 Updated 4 months ago
- Code for the papers Hypernetworks in Meta-Reinforcement Learning (Beck et al., 2022) and Recurrent Hypernetworks are Surprisingly Strong … ☆12 Updated 6 months ago