mrahtz / learning-from-human-preferences
Reproduction of OpenAI and DeepMind's "Deep Reinforcement Learning from Human Preferences"
☆327Updated 3 years ago
Alternatives and similar repositories for learning-from-human-preferences
Users who are interested in learning-from-human-preferences are comparing it to the repositories listed below
- A basic 2D maze environment where an agent starts from the top left corner and tries to find its way to the bottom left corner.☆371Updated last year
- A simple framework for experimenting with Reinforcement Learning in Python.☆318Updated last year
- Tensorflow/Keras code and trained models for Episodic Curiosity Through Reachability☆205Updated 4 years ago
- A Python interface for reinforcement learning environments☆372Updated 2 years ago
- Deep reinforcement learning model implementation in Tensorflow + OpenAI gym☆300Updated 2 years ago
- World Models Experiments☆656Updated 2 years ago
- Reimplementation of World-Models (Ha and Schmidhuber 2018) in pytorch☆626Updated 3 years ago
- Basic versions of agents from Spinning Up in Deep RL written in PyTorch☆206Updated 4 years ago
- A customizable framework to create maze and gridworld environments☆268Updated 6 years ago
- ☆303Updated 2 years ago
- Deep Planning Network: Control from pixels by latent planning with learned dynamics☆370Updated 3 years ago
- Offline Reinforcement Learning (aka Batch Reinforcement Learning) on Atari 2600 games☆553Updated 2 years ago
- Dream to Control: Learning Behaviors by Latent Imagination☆548Updated 3 years ago
- ☆137Updated 7 years ago
- Reinforcement Learning with Deep Energy-Based Policies☆429Updated last year
- Code for the paper "Quantifying Transfer in Reinforcement Learning"☆398Updated last year
- RAD: Reinforcement Learning with Augmented Data☆409Updated 4 years ago
- For educational materials related to the spinning up workshops.☆202Updated 6 years ago
- Code for Go-Explore: a New Approach for Hard-Exploration Problems☆574Updated 2 years ago
- Multi Agent Reinforcement Learning using MalmÖ☆258Updated 5 years ago
- Code for the paper "Phasic Policy Gradient"☆262Updated 2 years ago
- Recurrent and multi-process PyTorch implementation of deep reinforcement Actor-Critic algorithms A2C and PPO☆205Updated 2 years ago
- Trust Region Policy Optimization with TensorFlow and OpenAI Gym☆360Updated 5 years ago
- Real-World RL Benchmark Suite☆355Updated 5 years ago
- Clone of OpenAI's Spinning Up in PyTorch☆152Updated 3 years ago
- Keeping track of RL experiments☆163Updated 2 years ago
- Structural implementation of RL key algorithms☆514Updated 2 years ago
- Code for the paper "When to Trust Your Model: Model-Based Policy Optimization"☆508Updated 2 years ago
- A high-performance Atari A3C agent in 180 lines of PyTorch☆171Updated 4 years ago
- Paired Open-Ended Trailblazer (POET) and Enhanced POET☆254Updated 3 years ago