pickxiguapi/Clean-Offline-RLHF

Offline RLHF codebase implementation for "Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback" (ICLR2024)

☆42

Alternatives and similar repositories for Clean-Offline-RLHF

Users who are interested in Clean-Offline-RLHF are comparing it to the libraries listed below.

pickxiguapi / Uni-RLHF-Platform
Uni-RLHF platform for "Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback" (ICLR2024…
☆42 · Nov 20, 2024 · Updated last year
ZibinDong / AlignDiff-ICLR2024
☆33 · Mar 10, 2024 · Updated 2 years ago
clvrai / idapt
Policy Transfer across Visual and Dynamics Domain Gaps via Iterative Grounding (RSS 2021)
☆12 · Oct 22, 2021 · Updated 4 years ago
brentyi / transformer-exercises-jax
☆18 · Apr 17, 2026 · Updated 3 months ago
Improbable-AI / harness-offline-rl
Official implementation of Harnessing Mixed Offline Reinforcement Learning Datasets via Trajectory Reweighting
☆16 · Feb 14, 2024 · Updated 2 years ago
KyunghyunLee / aes-rl
☆17 · Dec 12, 2020 · Updated 5 years ago
frt03 / jax_dt
Minimal Decision Transformer Implementation written in Jax (Flax).
☆18 · Aug 8, 2022 · Updated 3 years ago
baitingzbt / PEDA
Scaling Pareto-Efficient Decision Making via Offline Multi-Objective RL, published in ICLR 2023
☆34 · Dec 7, 2024 · Updated last year
ethanluoyc / corax
Corax: Core RL in JAX
☆41 · Feb 22, 2024 · Updated 2 years ago
aliang8 / varibad_jax
☆10 · Jun 27, 2024 · Updated 2 years ago
LAMDA-RL / ACT
Official code for ACT: Empowering Decision Transformer with Dynamic Programming via Advantage Conditioning (AAAI'24)
☆17 · Feb 10, 2024 · Updated 2 years ago
tinkoff-ai / cnf
Official implementation for "Let Offline RL Flow: Training Conservative Agents in the Latent Space of Normalizing Flows", NeurIPS 2022, O…
☆12 · Jan 31, 2023 · Updated 3 years ago
daniel-robotics / ros_python_pkg
Template Catkin package for ROS-1 Noetic; Contains basic structure for creating rospy nodes
☆17 · Oct 21, 2022 · Updated 3 years ago
qlan3 / Jaxplorer
Jaxplorer is a Jax reinforcement learning (RL) framework for exploring new ideas.
☆13 · Jul 19, 2024 · Updated 2 years ago
avillaflor / SPLT-transformer
☆18 · Jul 10, 2022 · Updated 4 years ago
QData / dmc_remastered
A version of the DeepMind Control Suite with randomly generated graphics, for measuring visual generalization in continuous control.
☆20 · Oct 19, 2020 · Updated 5 years ago
nicob15 / Trajectory-Generation-Control-and-Safety-with-Denoising-Diffusion-Probabilistic-Models
☆14 · Jun 29, 2023 · Updated 3 years ago
pokaxpoka / B_Pref
☆54 · Nov 10, 2022 · Updated 3 years ago
AlexGoldie / learn-rl-algorithms
Official implementation for "How Should We Meta-Learn Reinforcement Learning Algorithms?"
☆23 · Sep 7, 2025 · Updated 10 months ago
pcchenxi / LAPO-offlienRL
☆16 · Apr 14, 2026 · Updated 3 months ago
typoverflow / WiseRL
PyTorch implementations for Offline Preference-Based RL (PbRL) algorithms
☆21 · Mar 24, 2025 · Updated last year
IVtest-Lab / RISEE_dataset
☆15 · Apr 25, 2026 · Updated 2 months ago
jhejna / cpl
Code for Contrastive Preference Learning (CPL)
☆184 · Nov 22, 2024 · Updated last year
IanYangChina / SI4RP-data
☆17 · Updated this week
snu-mllab / DPPO
Official implementation of "Direct Preference-based Policy Optimization without Reward Modeling" (NeurIPS 2023)
☆43 · Jul 20, 2024 · Updated 2 years ago
rll-research / finetune-vs-metarl
☆14 · May 31, 2022 · Updated 4 years ago
jrobine / twm
Transformer-based World Models
☆90 · Apr 4, 2023 · Updated 3 years ago
rll-research / teachable
☆17 · Oct 12, 2023 · Updated 2 years ago
Howuhh / sac-n-jax
Single-file SAC-N implementation on jax with flax and equinox. 10x faster than pytorch
☆56 · May 21, 2023 · Updated 3 years ago
thuml / SPOT
Code release for "Supported Policy Optimization for Offline Reinforcement Learning" (NeurIPS 2022), https://arxiv.org/abs/2202.06239
☆22 · Jun 24, 2023 · Updated 3 years ago
csmile-1006 / PreferenceTransformer
Preference Transformer: Modeling Human Preferences using Transformers for RL (ICLR2023 Accepted)
☆168 · Oct 15, 2023 · Updated 2 years ago
JunjieWang95 / attention-based-lane-changing
☆12 · Mar 15, 2022 · Updated 4 years ago
mazpie / mastering-urlb
[ICML 2023] Pre-train world model-based agents with different unsupervised strategies, fine-tune the agent's components selectively, and …
☆41 · Feb 27, 2024 · Updated 2 years ago
ksluck / Coadaptation
Repository replicating the design- and behaviour-adaptation algorithm using reinforcement learning algorithm presented in the paper " Dat…
☆27 · Jul 20, 2022 · Updated 4 years ago
apexrl / GCRL-Collection
This repo relates to the survey paper <Goal-Conditioned Reinforcement Learning: Problems and Solutions>. We collects widely used benchmar…
☆145 · May 10, 2023 · Updated 3 years ago
younggyoseo / trajectory_mcl
Trajectory-wise Multiple Choice Learning for Dynamics Generalization in Reinforcement Learning (NeurIPS 2020)
☆39 · Oct 27, 2020 · Updated 5 years ago
ZibinDong / cocos
Official implementation of the paper "Conditioning Matters: Training Diffusion Policies is Faster Than You Think".
☆18 · May 19, 2025 · Updated last year
FuxiRL / DunkCityDynasty
☆74 · Feb 4, 2024 · Updated 2 years ago
nik7273 / covid-pgmorl
Multi-objective reinforcement learning for covid-19 control
☆12 · Aug 12, 2021 · Updated 4 years ago