ZhaolinGao/REBEL

Reinforcement Learning via Regressing Relative Rewards

☆40

Alternatives and similar repositories for REBEL

Users who are interested in REBEL are comparing it to the repositories listed below.

Cornell-RL / drpo
Dataset Reset Policy Optimization
☆30 · Apr 12, 2024 · Updated 2 years ago
ZhaolinGao / REFUEL
Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF
☆25 · Oct 8, 2024 · Updated last year
yudasong / briee
Representation Learning in RL
☆13 · Jun 1, 2022 · Updated 4 years ago
abaheti95 / LoL-RL
Advantage Leftover Lunch Reinforcement Learning (A-LoL RL): Improving Language Models with Advantage-based Offline Policy Gradients
☆26 · Sep 10, 2024 · Updated last year
qiwang067 / CoWorld
[NeurIPS 2024] PyTorch code for the paper "Making Offline RL Online: Collaborative World Models for Offline Visual Reinforcement Learning…
☆28 · Oct 24, 2025 · Updated 9 months ago
haotiansun14 / BBox-Adapter
Lightweight Adapting for Black-Box Large Language Models
☆26 · Feb 15, 2024 · Updated 2 years ago
ars22 / scaling-LLM-math-synthetic-data
Code and data used in the paper: "Training on Incorrect Synthetic Data via RL Scales LLM Math Reasoning Eight-Fold"
☆32 · Jun 16, 2024 · Updated 2 years ago
microsoft / lightATAC
A lightweight reimplementation of Adversarially Trained Actor Critic
☆19 · Mar 19, 2026 · Updated 4 months ago
cmu-l3 / neurips2024-inference-tutorial-code
NeurIPS 2024 tutorial on LLM Inference
☆50 · Dec 10, 2024 · Updated last year
kyegomez / EvoVLM-JP
Plug in & Play Pytorch Implementation of the paper: "Evolutionary Optimization of Model Merging Recipes" by Sakana AI
☆33 · Nov 11, 2024 · Updated last year
meta-pytorch / remat
torch_remat fine-grained activation checkpointing API
☆15 · Updated this week
scottlogic-alex / prm800k-denorm
Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format
☆27 · Jul 12, 2023 · Updated 3 years ago
ymetz / rlhfblender
RLHF-Blender: A Configurable Interactive Interface for Learning from Diverse Human Feedback
☆14 · May 19, 2026 · Updated 2 months ago
drivendataorg / snomed-ct-entity-linking-runtime
Runtime repository for the SNOMED CT Entity Linking challenge on DrivenData
☆14 · Mar 5, 2024 · Updated 2 years ago
aravindr93 / robustRL
Robust policy search algorithms which train on model ensembles
☆31 · Oct 26, 2016 · Updated 9 years ago
pyro-ppl / sandbox
Pyro models and misc examples.
☆20 · May 10, 2021 · Updated 5 years ago
ImprintLab / SPA
SPA: Efficient User-Preference Alignment against Uncertainty in Medical Image Segmentation (ICCV 2025)
☆16 · Sep 26, 2025 · Updated 10 months ago
ZiyiZhang27 / sdpo
[IEEE TPAMI] Code for the paper "Aligning Few-Step Diffusion Models with Dense Reward Difference Learning"
☆22 · Feb 25, 2026 · Updated 5 months ago
kvablack / LLaVA-server
☆22 · Oct 20, 2023 · Updated 2 years ago
huxiao09 / QPA
☆13 · Sep 24, 2024 · Updated last year
Ailln / ACL2020-Paper-Code-Blog
🗂 ACL2020 papers, code, and blogs
☆22 · Jul 7, 2020 · Updated 6 years ago
rll-research / rune
Code for paper: Reward Uncertainty for Exploration in Preference-based Reinforcement Learning
☆15 · May 26, 2022 · Updated 4 years ago
facebookresearch / SIE
Code for the paper Self-Supervised Learning of Split Invariant Equivariant Representations
☆32 · Sep 4, 2023 · Updated 2 years ago
zhengzhi-1997 / LLM-TRSR
☆16 · May 22, 2025 · Updated last year
Jiaxin-Pei / Potato-Prolific-Dataset
☆17 · Jun 14, 2023 · Updated 3 years ago
thu-ml / LM-Calibration
☆17 · May 31, 2023 · Updated 3 years ago
tedmoskovitz / ConstrainedRL4LMs
A library for constrained RLHF.
☆13 · Feb 19, 2024 · Updated 2 years ago
google-deepmind / enn_acme
☆30 · Aug 25, 2022 · Updated 3 years ago
AkideLiu / MiniCache
☆14 · Sep 7, 2024 · Updated last year
agentica-project / verl
☆17 · Mar 30, 2026 · Updated 4 months ago
FranxYao / Complexity-Based-Prompting
Complexity Based Prompting for Multi-Step Reasoning
☆17 · Mar 10, 2023 · Updated 3 years ago
XueruiSu / Trust-Region-Preference-Approximation
Trust Region Preference Approximation: A simple and stable reinforcement learning algorithm for LLM reasoning
☆15 · Jun 28, 2025 · Updated last year
jhejna / cpl
Code for Contrastive Preference Learning (CPL)
☆184 · Nov 22, 2024 · Updated last year
typoverflow / WiseRL
PyTorch implementations for Offline Preference-Based RL (PbRL) algorithms
☆21 · Mar 24, 2025 · Updated last year
davidireland-iso / LeNSE
☆14 · Nov 26, 2022 · Updated 3 years ago
Silent-Zebra / twisted-smc-lm
☆35 · Mar 27, 2025 · Updated last year
AlignInc / aligner-replication
A reproduction of the paper "Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction"
☆21 · May 29, 2024 · Updated 2 years ago
microsoft / MMLMCalibration
Code for EMNLP 2022 Paper: On the Calibration of Massively Multilingual Language Models
☆15 · Jun 12, 2023 · Updated 3 years ago
LaurenceA / COMS30017_2021
Computational Neuroscience 3rd year CS course at the University of Bristol
☆13 · Jul 19, 2022 · Updated 4 years ago