Reproduction of OpenAI and DeepMind's "Deep Reinforcement Learning from Human Preferences"
☆31 · Jul 27, 2021 · Updated 4 years ago
Alternatives and similar repositories for learning-from-human-preferences
Users that are interested in learning-from-human-preferences are comparing it to the libraries listed below.
- Reproduction of OpenAI and DeepMind's "Deep Reinforcement Learning from Human Preferences" ☆335 · Nov 29, 2021 · Updated 4 years ago
- Infer how suboptimal agents are suboptimal while planning, for example if they are hyperbolic time discounters. ☆25 · Sep 26, 2020 · Updated 5 years ago
- Code and data for the paper "Understanding Hidden Context in Preference Learning: Consequences for RLHF" ☆34 · Dec 14, 2023 · Updated 2 years ago
- [NeurIPS 2024] Official Implementation of Meta-DT ☆53 · Oct 16, 2024 · Updated last year
- [ICLR 2024 Spotlight] Code for the paper "Decision ConvFormer: Local Filtering in MetaFormer is Sufficient for Decision Making" ☆12 · Apr 22, 2024 · Updated last year
- ☆21 · Jun 27, 2024 · Updated last year
- A simple 1d simulator for the "Neural-Lander" paper, ICRA 2019 ☆19 · Feb 18, 2023 · Updated 3 years ago
- Bayesian Inverse Reinforcement Learning with simple environments ☆19 · May 17, 2022 · Updated 3 years ago
- Minimal example to apply Decision Transformer in Atari Pong ☆15 · Feb 1, 2025 · Updated last year
- Official repository for paper "Conservative Offline Distributional Reinforcement Learning" (NeurIPS 2021) ☆22 · Aug 1, 2021 · Updated 4 years ago
- Tools to Support OpenAtlas development ☆13 · Jul 9, 2019 · Updated 6 years ago
- A multi-agent environment using Unity ML-Agents Toolkit ☆10 · Dec 9, 2020 · Updated 5 years ago
- ☆37 · Apr 27, 2023 · Updated 2 years ago
- Implementing REINFORCE algorithm on Pong, Lunar Lander and CartPole + Medium Article ☆23 · Nov 24, 2020 · Updated 5 years ago
- Code to reproduce the experiments in The Mirage of Action-Dependent Baselines in Reinforcement Learning. ☆17 · Aug 2, 2018 · Updated 7 years ago
- ☆10 · Feb 9, 2024 · Updated 2 years ago
- Reinforcement Learning Algorithms with Unity 3D Environments ☆18 · Jul 15, 2019 · Updated 6 years ago
- Reward Learning by Simulating the Past ☆46 · May 9, 2019 · Updated 6 years ago
- [ICANN 2022] "An Improved Lightweight YOLOv5 Model Based on Attention Mechanism for Face Mask Detection" Official Code ☆10 · Feb 27, 2024 · Updated 2 years ago
- implementation of Advanced Encryption Standard (AES) Block Cipher