HumanCompatibleAI / interpreting-rewards
Experiments in applying interpretability techniques to learned reward functions.
☆9 · Updated 4 years ago
Alternatives and similar repositories for interpreting-rewards:
Users who are interested in interpreting-rewards are comparing it to the libraries listed below:
- Library to compare and evaluate reward functions ☆66 · Updated last year
- Benchmark environments for reward modelling and imitation learning algorithms. ☆46 · Updated last year
- Reward Learning by Simulating the Past ☆44 · Updated 5 years ago
- AGAC: Adversarially Guided Actor-Critic ☆48 · Updated 3 years ago
- Deep Reinforcement Learning algorithms implemented in PyTorch ☆49 · Updated 6 years ago
- Infer how suboptimal agents are suboptimal while planning, for example if they are hyperbolic time discounters. ☆23 · Updated 4 years ago
- Neurosymbolic transformers for multi-agent communication. ☆21 · Updated 4 years ago
- Estimating Q(s,s') with Deep Deterministic Dynamics Gradients ☆32 · Updated 5 years ago
- (Experimental) Inverse reinforcement learning from trajectories generated by multiple agents with different (but correlated) rewards ☆27 · Updated 5 years ago
- Gym wrapper for pysc2 ☆10 · Updated 2 years ago
- Code for Optimistic Exploration even with a Pessimistic Initialisation ☆14 · Updated 4 years ago
- This repository contains code for the method and experiments of the paper "Learning with AMIGo: Adversarially Motivated Intrinsic Goals". ☆61 · Updated last year
- TeachMyAgent is a testbed platform for Automatic Curriculum Learning methods in Deep RL. ☆68 · Updated last year
- A collection of RL algorithms written in JAX. ☆95 · Updated 2 years ago
- JAX implementation of Proximal Policy Optimization (PPO) specifically tuned for Procgen, with benchmarked results and saved model weights… ☆53 · Updated 2 years ago
- Learning Action-Value Gradients in Model-based Policy Optimization ☆31 · Updated 3 years ago
- Hierarchical Self-Play ☆21 · Updated 6 years ago
- Collection of reinforcement learning algorithms ☆15 · Updated 3 years ago
- Revisiting Rainbow ☆74 · Updated 3 years ago
- Reinforcement Learning via Latent State Decoding ☆30 · Updated last year
- Easy MDPs and grid worlds with accessible transition dynamics to do exact calculations ☆48 · Updated 2 years ago
- Implementation of Advantage Actor Critic (A2C) and Proximal Policy Optimization (PPO) that takes advantage of TensorFlow 2.x. ☆9 · Updated 4 years ago
- Implementation of the Model-Based Meta-Policy-Optimization (MB-MPO) algorithm ☆44 · Updated 6 years ago
- Reinforcement Learning with Latent Flow ☆43 · Updated 3 years ago
- Hierarchical deep reinforcement learning algorithms ☆41 · Updated 7 years ago
- Efficient Exploration via State Marginal Matching (2019) ☆67 · Updated 5 years ago
- TensorFlow 2 source code for the PI-SAC agent from "Predictive Information Accelerates Learning in RL" (NeurIPS 2020) ☆44 · Updated last year
- Code for paper "Successor Uncertainties: Exploration and Uncertainty in Temporal Difference Learning" by David Janz*, Jiri Hron*, Przemys… ☆20 · Updated last year