maitchison / PPO

Example implementation of the Proximal Policy Optimization algorithm

☆16

Alternatives and similar repositories for PPO

Users who are interested in PPO are comparing it to the libraries listed below.

quantumiracle / MARS
MARS is short for Multi-Agent Research Studio, a library for multi-agent reinforcement learning research.
☆48 · Updated last year
facebookresearch / off-belief-learning
Implementation of the Off Belief Learning algorithm.
☆47 · Updated 2 years ago
kristery / Elastic-DT
[NeurIPS 2023] Implementation of Elastic Decision Transformer
☆35 · Updated last year
HumanCompatibleAI / learning-from-human-preferences
Reproduction of OpenAI and DeepMind's "Deep Reinforcement Learning from Human Preferences"
☆29 · Updated 3 years ago
changchencc / TransDreamer
Code for TransDreamer: Reinforcement Learning with Transformer World Models
☆25 · Updated last year
ReinholdM / Offline-Pre-trained-Multi-Agent-Decision-Transformer
☆111 · Updated 2 years ago
weipu-zhang / STORM
☆86 · Updated 11 months ago
mxu34 / prompt-dt
Official code repository for Prompt-DT.
☆109 · Updated 2 years ago
thu-rllab / CFCQL
Code for the NeurIPS 2023 paper "Counterfactual Conservative Q Learning for Offline Multi-agent Reinforcement Learning".
☆36 · Updated 3 months ago
Howuhh / faster-trajectory-transformer
Implementation of Trajectory Transformer with attention caching and batched beam search
☆112 · Updated 2 years ago
k4ntz / OC_Atari
Object Centric Atari games
☆78 · Updated this week
jesbu1 / hidio
GitHub repo for HIDIO: Hierarchical Reinforcement Learning by Discovering Intrinsic Options
☆46 · Updated 3 years ago
rlglab / optionzero
[ICLR 2025 Oral] OptionZero: A method for autonomously discovering and utilizing options in the MuZero algorithm
☆14 · Updated 3 months ago
facebookresearch / MRQ
MR.Q is a general-purpose model-free reinforcement learning algorithm.
☆91 · Updated last month
NJU-RL / Meta-DT
☆24 · Updated 7 months ago
machelreid / can-wikipedia-help-offline-rl
Official code for "Can Wikipedia Help Offline Reinforcement Learning?" by Machel Reid, Yutaro Yamada and Shixiang Shane Gu
☆104 · Updated 2 years ago
keraJLi / synthetic-gymnax
Drop-in environment replacements that make your RL algorithm train faster.
☆20 · Updated 10 months ago
flowersteam / TeachMyAgent
TeachMyAgent is a testbed platform for Automatic Curriculum Learning methods in Deep RL.
☆74 · Updated last year
CILAB-MA / Machine_ToM
Implementation of "Machine Theory of Mind" (ICML 2018)
☆24 · Updated 3 years ago
DHDev0 / Muzero-unplugged
PyTorch implementation of MuZero Unplugged for Gym environments. This algorithm is capable of supporting a wide range of action and observ…
☆27 · Updated 2 years ago
PKU-RL / AdaRefiner
AdaRefiner: Refining Decisions of Language Models with Adaptive Feedback (NAACL 2024)
☆15 · Updated 9 months ago
luchris429 / model-free-opponent-shaping
Code for Model-Free Opponent Shaping (ICML 2022)
☆18 · Updated 2 years ago
nuwuxian / RL-state_mask
☆13 · Updated last year
ml-jku / L2M
Learning to Modulate pre-trained Models in RL (Decision Transformer, LoRA, Fine-tuning)
☆57 · Updated 7 months ago
daniellawson9999 / online-decision-transformer
An unofficial implementation of Online Decision Transformer
☆40 · Updated 2 years ago
yfletberliac / adversarially-guided-actor-critic
AGAC: Adversarially Guided Actor-Critic
☆49 · Updated 3 years ago
etaoxing / multigame-dt
Implementation of Multi-Game Decision Transformers in PyTorch
☆46 · Updated 2 years ago
quantumiracle / nash-dqn
Official code for Nash-DQN, an algorithm for two-player zero-sum Markov games. For details, see our paper: A Deep Reinforcement…
☆20 · Updated 2 years ago
adityabingi / Dreamer
Reproduction of DreamerV1 and DreamerV2 in PyTorch for the DeepMind Control Suite
☆39 · Updated 2 years ago
facebookresearch / level-replay
This code implements Prioritized Level Replay, a method for sampling training levels for reinforcement learning agents that exploits the …
☆86 · Updated 3 years ago