HumanCompatibleAI / rlsp
Reward Learning by Simulating the Past
☆44Updated 6 years ago
Alternatives and similar repositories for rlsp
Users that are interested in rlsp are comparing it to the libraries listed below
- This repository contains code for the method and experiments of the paper "Learning with AMIGo: Adversarially Motivated Intrinsic Goals".☆61Updated last year
- ☆44Updated 6 years ago
- ☆80Updated last year
- Code for reproducing experiments in Model-Based Active Exploration, ICML 2019☆78Updated 5 years ago
- Deep Reinforcement Learning algorithms implemented in PyTorch☆49Updated 6 years ago
- Inferring beliefs about dynamics from behavior☆29Updated 7 years ago
- Reinforcement Learning papers on exploration methods.☆19Updated 3 years ago
- BabyAI++: Towards Grounded language Learning beyond Memorization, ICLR BeTR-RL 2020☆26Updated 4 years ago
- [ICML 2019] TensorFlow Code for Self-Supervised Exploration via Disagreement☆124Updated 5 years ago
- Solving reinforcement learning tasks which require language and vision☆32Updated 2 years ago
- Algorithmic Framework for Model-based Deep Reinforcement Learning with Theoretical Guarantees☆93Updated 5 years ago
- JAX code for the paper "Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation"☆43Updated 3 years ago
- On the pitfalls of measuring emergent communication☆34Updated 6 years ago
- Baselines and memory-based scenarios for the ViZDoom simulator☆34Updated 2 years ago
- E2C implementation in PyTorch☆43Updated 7 years ago
- Variational Reinforcement Learning☆16Updated 10 months ago
- RL framework for embodied agents based on PyTorch☆12Updated 6 years ago
- CLEVR-Robot: a reinforcement learning environment combining vision, language and control.☆134Updated 10 months ago
- Code repository for On the interaction between supervision and self-play in emergent communication (ICLR 2020)☆16Updated 5 years ago
- Estimating Q(s,s') with Deep Deterministic Dynamics Gradients☆32Updated 5 years ago
- Implementation of Advantage Actor Critic (A2C) and Proximal Policy Optimization (PPO) algorithms that uses the advantages of TensorFlow 2.x.☆9Updated 5 years ago
- Automatic Data-Regularized Actor-Critic (Auto-DrAC)☆103Updated 2 years ago
- Official code for the paper "Learning Transition Policies for Composing Complex Skills" (ICLR 2019)☆73Updated 6 years ago
- Maximum Entropy-Regularized Multi-Goal Reinforcement Learning (ICML 2019)☆23Updated 6 years ago
- Generalised UDRL☆37Updated 3 years ago
- Code for ICLR 2019 paper Learning Dynamics Model by Incorporating the Long Term Future☆50Updated 5 years ago
- Code accompanying the OptionGAN paper.☆44Updated 6 years ago
- Code for experimenting with state and action abstractions in reinforcement learning.☆31Updated 4 years ago
- A collection of reading material for the Workshop on "Structure & Priors in Reinforcement Learning" (SPiRL) at ICLR 2019.☆13Updated 4 years ago
- Infer how suboptimal agents are suboptimal while planning, for example if they are hyperbolic time discounters.☆25Updated 4 years ago