lucidrains / ppo

An implementation of PPO in Pytorch

☆91

Alternatives and similar repositories for ppo

Users who are interested in ppo are comparing it to the libraries listed below.

ollebompa / PGA-MAP-Elites
Repository for the PGA-MAP-Elites algorithm. PGA-MAP-Elites was developed to efficiently scale MAP-Elites to large genotypes and noisy d…
☆57 Updated 3 years ago
lucidrains / SAC-pytorch
Implementation of Soft Actor Critic and some of its improvements in Pytorch
☆60 Updated 5 months ago
vwxyzjn / cleanba
CleanRL's implementation of DeepMind's Podracer Sebulba Architecture for Distributed DRL
☆114 Updated 10 months ago
Reytuag / transformerXL_PPO_JAX
☆81 Updated 8 months ago
lucidrains / improving-transformers-world-model-for-rl
Implementation of the new SOTA for model based RL, from the paper "Improving Transformer World Models for Data-Efficient RL", in Pytorch
☆128 Updated 2 months ago
DHDev0 / Muzero-unplugged
Pytorch Implementation of MuZero Unplugged for gym environment. This algorithm is capable of supporting a wide range of action and observ…
☆27 Updated 3 weeks ago
facebookresearch / MRQ
MR.Q is a general-purpose model-free reinforcement learning algorithm.
☆105 Updated 3 weeks ago
vmicheli / delta-iris
Efficient World Models with Context-Aware Tokenization. ICML 2024
☆105 Updated 9 months ago
lowrollr / turbozero
fast + parallel AlphaZero in JAX
☆97 Updated 6 months ago
luchris429 / popjaxrl
Benchmarking RL for POMDPs in Pure JAX [Code for "Structured State Space Models for In-Context Reinforcement Learning" (NeurIPS 2023)]
☆107 Updated last year
facebookresearch / oni
Learn online intrinsic rewards from LLM feedback
☆41 Updated 6 months ago
jacooba / hyper
Code for the papers Hypernetworks in Meta-Reinforcement Learning (Beck et al., 2022) and Recurrent Hypernetworks are Surprisingly Strong …
☆14 Updated 11 months ago
MarcoMeter / endless-memory-gym
Challenging Memory-based Deep Reinforcement Learning Agents
☆101 Updated 8 months ago
keraJLi / synthetic-gymnax
Drop-in environment replacements that make your RL algorithm train faster.
☆21 Updated last year
AlexGoldie / rl-learned-optimization
Official Implementation of "Can Learned Optimization Make Reinforcement Learning Less Difficult"
☆25 Updated 2 months ago
hr0nix / dejax
Accelerated replay buffers in JAX
☆41 Updated 2 years ago
radarFudan / mamba-minimal-jax
☆31 Updated 7 months ago
instadeepai / outer-value-function-meta-rl
Code of the paper: Debiasing Meta-Gradient Reinforcement Learning by Learning the Outer Value Function
☆13 Updated 2 years ago
young-geng / mintext
Minimal but scalable implementation of large language models in JAX
☆35 Updated last week
snu-mllab / DPPO
Official implementation of "Direct Preference-based Policy Optimization without Reward Modeling" (NeurIPS 2023)
☆42 Updated 11 months ago
CarperAI / Algorithm-Distillation-RLHF
☆34 Updated 2 years ago
ZhaolinGao / REBEL
Reinforcement Learning via Regressing Relative Rewards
☆34 Updated 7 months ago
CLAIRE-Labo / EvoTune
Efficiently discovering algorithms via LLMs with evolutionary search and reinforcement learning.
☆103 Updated this week
tinkoff-ai / ReBRAC
Author's implementation of ReBRAC, a minimalist improvement upon TD3+BC
☆55 Updated last year
dunnolab / xland-minigrid-datasets
XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning (ICLR 2025)
☆75 Updated 5 months ago
instadeepai / sebulba
🪐 The Sebulba architecture to scale reinforcement learning on Cloud TPUs in JAX
☆58 Updated last year
chongyi-zheng / td_infonce
Implementations of Temporal Difference InfoNCE (TD InfoNCE)
☆29 Updated last year
cgarciae / nanoGPT-jax
The simplest, fastest repository for training/finetuning medium-sized GPTs.
☆34 Updated last year
tinkoff-ai / katakomba
Data-Driven NetHack Tools: Datasets (30+) and recurrent-baselines (AWAC, BC, CQL, IQL, REM)
☆74 Updated 2 years ago
snu-mllab / Achievement-Distillation
Official PyTorch implementation of "Discovering Hierarchical Achievements in Reinforcement Learning via Contrastive Learning" (NeurIPS 20…
☆33 Updated 4 months ago