openai / ppo-ewma

Code for the paper "Batch size invariance for policy optimization"

☆51

Alternatives and similar repositories for ppo-ewma

Users who are interested in ppo-ewma are comparing it to the repositories listed below.

evgenii-nikishin / rl_with_resets
JAX implementation of deep RL agents with resets from the paper "The Primacy Bias in Deep Reinforcement Learning"
☆100 · Updated 3 years ago
facebookresearch / level-replay
This code implements Prioritized Level Replay, a method for sampling training levels for reinforcement learning agents that exploits the …
☆88 · Updated 4 years ago
johanobandoc / revisiting_rainbow
Revisiting Rainbow
☆75 · Updated 4 years ago
rraileanu / idaac
☆54 · Updated last year
sahandrez / homomorphic_policy_gradient
Author's PyTorch Implementation of Deep Homomorphic Policy Gradient (DHPG) - NeurIPS 2022 and JMLR 2024
☆23 · Updated last year
tedmoskovitz / TOP
Implementation of Tactical Optimistic and Pessimistic value estimation
☆25 · Updated 2 years ago
RajGhugare19 / alm
Simplifying Model-based RL: Learning Representations, Latent-space Models and Policies with One Objective
☆81 · Updated 2 years ago
salesforce / sibling-rivalry
Code for Sibling Rivalry and experiments presented in associated paper
☆18 · Updated 3 months ago
mila-iqia / spr
Code for "Data-Efficient Reinforcement Learning with Self-Predictive Representations"
☆161 · Updated 3 years ago
YYCAAA / V-MPO_Lunarlander
Simple implementation of V-MPO proposed in https://arxiv.org/abs/1909.12238
☆48 · Updated 4 years ago
yifan12wu / rl-laplacian
Learning Laplacian Representations in Reinforcement Learning
☆17 · Updated 4 years ago
frt03 / generalized_dt
Generalized Decision Transformer for Offline Hindsight Information Matching (ICLR2022)
☆67 · Updated 3 years ago
kzl / lifelong_rl
PyTorch implementations of RL algorithms, focusing on model-based, lifelong, reset-free, and offline algorithms. Official codebase for Re…
☆107 · Updated 3 years ago
younggyoseo / RE3
RE3: State Entropy Maximization with Random Encoders for Efficient Exploration
☆69 · Updated 4 years ago
jsikyoon / V-MPO_torch
V-MPO torch version with DMLab30 and GTrXL
☆13 · Updated 4 years ago
denisyarats / exorl
ExORL: Exploratory Data for Offline Reinforcement Learning
☆115 · Updated 3 years ago
toshikwa / rljax
A collection of RL algorithms written in JAX.
☆102 · Updated 3 years ago
lanyavik / BAIL
☆17 · Updated 3 years ago
scottemmons / rvs
Reinforcement Learning via Supervised Learning
☆71 · Updated 3 years ago
montrealrobotics / iv_rl
IV-RL - Sample Efficient Deep Reinforcement Learning via Uncertainty Estimation
☆40 · Updated 3 weeks ago
deep-skill-chaining / deep-skill-chaining
Implementation of the skill discovery algorithm described in ICLR submission "Option Discovery using Deep Skill Chaining"
☆29 · Updated 5 years ago
sfujim / LAP-PAL
Author's PyTorch implementation of LAP and PAL with TD3 and DDQN
☆36 · Updated 3 years ago
ahmed-touati / controllable_agent
☆48 · Updated 2 years ago
yfletberliac / adversarially-guided-actor-critic
AGAC: Adversarially Guided Actor-Critic
☆48 · Updated 3 years ago
RLAgent / state-marginal-matching
Efficient Exploration via State Marginal Matching (2019)
☆69 · Updated 6 years ago
younggyoseo / CaDM
CaDM: Context-aware Dynamics Model for Generalization in Model-based Reinforcement Learning
☆63 · Updated 5 years ago
openai / phasic-policy-gradient
Code for the paper "Phasic Policy Gradient"
☆262 · Updated 2 years ago
alirezakazemipour / PPO-RND
Random network distillation on Montezuma's Revenge and Super Mario Bros.
☆51 · Updated 2 months ago
jparkerholder / DvD_ES
Code from the paper "Effective Diversity in Population Based Reinforcement Learning", presented as a spotlight at NeurIPS 2020. This is t…
☆44 · Updated 4 years ago
vwxyzjn / cleanba
CleanRL's implementation of DeepMind's Podracer Sebulba Architecture for Distributed DRL
☆114 · Updated 11 months ago