HumanCompatibleAI / learning-from-human-preferences

Reproduction of OpenAI and DeepMind's "Deep Reinforcement Learning from Human Preferences"

☆31

Alternatives and similar repositories for learning-from-human-preferences

Users who are interested in learning-from-human-preferences compare it to the repositories listed below.

Stanford-ILIAD / APReL
A Library for Active Preference-based Reward Learning Algorithms
☆52 Updated last year
hari-sikchi / LOOP
Learning Off-Policy with Online Planning [CoRL 2021 Best Paper Finalist]
☆40 Updated 2 years ago
TakuyaHiraoka / Dropout-Q-Functions-for-Doubly-Efficient-Reinforcement-Learning
Source files to replicate experiments in my ICLR 2022 paper.
☆70 Updated 3 weeks ago
jhejna / few-shot-preference-rl
☆35 Updated 2 years ago
flowersteam / TeachMyAgent
TeachMyAgent is a testbed platform for Automatic Curriculum Learning methods in Deep RL.
☆76 Updated last year
jesbu1 / hidio
GitHub repo for HIDIO: Hierarchical Reinforcement Learning by Discovering Intrinsic Options
☆46 Updated 3 years ago
jakegrigsby / super_sac
A general model-free off-policy actor-critic implementation. Continuous and Discrete Soft Actor-Critic with multimodal observations, data…
☆38 Updated last year
rmst / rlrd
PyTorch implementation of our paper Reinforcement Learning with Random Delays (ICLR 2020)
☆41 Updated 3 years ago
martius-lab / HiTS
Code for the paper: Hierarchical Reinforcement Learning With Timed Subgoals, published at NeurIPS 2021
☆34 Updated 3 years ago
montaserFath / BCO
Behavior Cloning from Observation
☆36 Updated 4 years ago
daniellawson9999 / online-decision-transformer
An unofficial implementation of Online Decision Transformer
☆40 Updated 2 years ago
Manchery / iql-pytorch
Unofficial PyTorch implementation (replicating paper results) of Implicit Q-Learning (In-sample Q-Learning) for offline RL
☆23 Updated 9 months ago
uoe-agents / derl
The official repository of "Decoupled Reinforcement Learning to Stabilise Intrinsically-Motivated Exploration" (AAMAS 2022)
☆27 Updated 3 years ago
ZhengyaoJiang / latentplan
Code release for Efficient Planning in a Compact Latent Action Space (ICLR 2023): https://arxiv.org/abs/2208.10291
☆109 Updated 2 years ago
intelligent-control-lab / guard
☆53 Updated 6 months ago
twni2016 / Memory-RL
When Do Transformers Shine in RL? Decoupling Memory from Credit Assignment, NeurIPS 2023 (oral)
☆63 Updated last year
denisyarats / exorl
ExORL: Exploratory Data for Offline Reinforcement Learning
☆115 Updated 3 years ago
montrealrobotics / iv_rl
IV-RL: Sample Efficient Deep Reinforcement Learning via Uncertainty Estimation
☆40 Updated 3 weeks ago
architsharma97 / earl_benchmark
EARL: Environment for Autonomous Reinforcement Learning
☆37 Updated 2 years ago
ml-jku / L2M
Learning to Modulate Pre-trained Models in RL (Decision Transformer, LoRA, Fine-tuning)
☆59 Updated 10 months ago
knowledgetechnologyuhh / goal_conditioned_RL_baselines
☆25 Updated 2 years ago
JasonMa2016 / CODAC
Official repository for the paper "Conservative Offline Distributional Reinforcement Learning" (NeurIPS 2021)
☆21 Updated 4 years ago
Stanford-ILIAD / ELLA
A reward shaping approach for instruction-following settings that leverages language at multiple levels of abstraction.
☆21 Updated 4 years ago
JasonMa2016 / GoFAR
Official repository for the paper "Offline Goal-Conditioned Reinforcement Learning via f-Advantage Regression" (NeurIPS 2022)
☆35 Updated last year
twni2016 / f-IRL
Inverse Reinforcement Learning via State Marginal Matching, CoRL 2020
☆45 Updated 2 years ago
DesikRengarajan / LOGO
[ICLR 2022 Spotlight] Code for Reinforcement Learning with Sparse Rewards using Guidance from Offline Demonstration
☆28 Updated 3 years ago
rmrafailov / LOMPO
Official codebase for Offline Reinforcement Learning from Images with Latent Space Models
☆30 Updated 4 years ago
kzl / lifelong_rl
PyTorch implementations of RL algorithms, focusing on model-based, lifelong, reset-free, and offline algorithms. Official codebase for Re…
☆107 Updated 3 years ago
philippe-eecs / IDQL
Repo for Implicit Diffusion Q-Learning
☆112 Updated last year
Improbable-AI / eipo
Official codebase for Redeeming Intrinsic Rewards via Constrained Policy Optimization
☆82 Updated 2 years ago