Bick95 / PPO
Comprehensive Implementation of Proximal Policy Optimization
☆11 Updated 4 years ago
Alternatives and similar repositories for PPO
Users that are interested in PPO are comparing it to the libraries listed below
- Code for the NeurIPS 2021 paper "Deep Bandits Show-Off: Simple and Efficient Exploration with Deep Networks" ☆14 Updated 3 years ago
- Pytorch implementation of Soft Actor-Critic ☆20 Updated 5 years ago
- A novel parallel UCT algorithm with linear speedup and negligible performance loss. ☆122 Updated 4 years ago
- QuaRL is an open-source framework for systematically studying the effect of applying quantization to reinforcement learning algorithms. ☆70 Updated 2 years ago
- An implementation of MuZero in JAX. ☆57 Updated 2 years ago
- ☆131 Updated last year
- Code for the paper "Batch size invariance for policy optimization" ☆52 Updated 2 years ago
- Mirror Descent Policy Optimization ☆40 Updated 4 years ago
- Simple gym environments for safety in Reinforcement Learning Research ☆18 Updated last year
- Efficient Exploration through Bayesian Deep-Q Networks. ☆19 Updated 3 years ago
- PyTorch implementation of "Model-based Reinforcement Learning for Semi-Markov Decision Processes with Neural ODEs", NeurIPS 2020 ☆45 Updated 4 years ago
- ☆113 Updated 5 years ago
- 🧶 Minimal PyTorch Soft Actor Critic (SAC) implementation ☆38 Updated 3 years ago
- ☆30 Updated last year
- ☆18 Updated 6 years ago
- Scaling scaling laws with board games. ☆53 Updated 2 years ago
- Optim4RL is a Jax framework of learning to optimize for reinforcement learning. ☆26 Updated 10 months ago
- Reproduction of AlphaTensor paper for 2x2 matrices ☆16 Updated last year
- Code to reproduce the experiments in The Mirage of Action-Dependent Baselines in Reinforcement Learning. ☆17 Updated 7 years ago
- A Jax/Stax implementation of the general meta learning paper: Oh, J., Hessel, M., Czarnecki, W.M., Xu, Z., van Hasselt, H.P., Singh, S. a… ☆22 Updated 4 years ago
- advantage actor-critic reinforcement learning for openai gym cartpole ☆65 Updated 8 years ago
- Open source demo for the paper Learning to Score Behaviors for Guided Policy Optimization ☆24 Updated 5 years ago
- Deep Q Networks ☆88 Updated 7 years ago
- Theory of Reinforcement Learning ☆17 Updated 4 years ago
- ☆47 Updated 5 years ago
- Sample-Efficient Automated Deep Reinforcement Learning ☆34 Updated 4 years ago
- A list of papers regarding generalization in (deep) reinforcement learning ☆152 Updated 2 years ago
- Code for a model-based version of Constrained Policy Optimization ☆11 Updated 4 years ago
- Reinforcement Learning with Convex Constraints ☆14 Updated 3 years ago
- on-policy optimization baselines for deep reinforcement learning ☆32 Updated 5 years ago