Bick95 / PPO
Comprehensive Implementation of Proximal Policy Optimization
☆10Updated 3 years ago
Alternatives and similar repositories for PPO:
Users who are interested in PPO are comparing it to the libraries listed below:
- An implementation of MuZero in JAX.☆55Updated 2 years ago
- QuaRL is an open-source framework for systematically studying the effect of applying quantization to reinforcement learning algorithms.☆66Updated last year
- A novel parallel UCT algorithm with linear speedup and negligible performance loss.☆115Updated 3 years ago
- A number of agents (PPO, MuZero) with a Perceiver-based NN architecture that can be trained to achieve goals in nethack/minihack environm…☆39Updated 2 years ago
- Mirror Descent Policy Optimization☆38Updated 4 years ago
- PyTorch implementation of Soft Actor-Critic☆18Updated 4 years ago
- JAX code for the paper "Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation"☆43Updated 3 years ago
- ☆47Updated 4 years ago
- Official PyTorch implementation for our ICLR 2023 paper "Latent State Marginalization as a Low-cost Approach for Improving Exploration".☆24Updated 2 years ago
- Scaling scaling laws with board games.☆48Updated last year
- Open source demo for the paper Learning to Score Behaviors for Guided Policy Optimization☆24Updated 4 years ago
- Accelerated replay buffers in JAX☆41Updated 2 years ago
- ☆65Updated 4 years ago
- Code for the paper "Batch size invariance for policy optimization"☆48Updated last year
- Code to reproduce the experiments in The Mirage of Action-Dependent Baselines in Reinforcement Learning.☆17Updated 6 years ago
- ☆15Updated 7 months ago
- Jax-Baseline is a Reinforcement Learning implementation using JAX and Flax/Haiku libraries, mirroring the functionality of Stable-Baselin…☆47Updated 3 weeks ago
- Plug-and-play hydra sweepers for the EA-based multifidelity method DEHB and several population-based training variations, all proven to e…☆74Updated last year
- Jax implementation of Proximal Policy Optimization (PPO) specifically tuned for Procgen, with benchmarked results and saved model weights…☆53Updated 2 years ago
- General Modules for JAX☆64Updated this week
- Code for the NeurIPS 2021 paper "Deep Bandits Show-Off: Simple and Efficient Exploration with Deep Networks"☆14Updated 2 years ago
- ☆22Updated 3 years ago
- Benchmarking RL for POMDPs in Pure JAX [Code for "Structured State Space Models for In-Context Reinforcement Learning" (NeurIPS 2023)]☆96Updated last year
- Vectorization techniques for fast population-based training.☆55Updated 2 years ago
- RE3: State Entropy Maximization with Random Encoders for Efficient Exploration☆68Updated 3 years ago
- Source code for the paper "CombOptNet: Fit the Right NP-Hard Problem by Learning Integer Programming Constraints"☆72Updated 2 years ago
- Theory of Reinforcement Learning☆16Updated 3 years ago
- This code implements Prioritized Level Replay, a method for sampling training levels for reinforcement learning agents that exploits the …☆85Updated 3 years ago
- PyTorch implementation of LOLA (https://arxiv.org/abs/1709.04326) using DiCE (https://arxiv.org/abs/1802.05098)☆93Updated 6 years ago
- Reinforcement Learning Assembly☆92Updated 3 years ago