ZinYY/Online_RLHF

A PyTorch implementation of the paper "Provably Efficient Online RLHF with One-Pass Reward Modeling". This repository provides a flexible and modular approach to Online Reinforcement Learning from Human Feedback (Online RLHF).

☆94

Alternatives and similar repositories for Online_RLHF

Users who are interested in Online_RLHF are comparing it to the libraries listed below.

ZinYY / TreeLoRA
A PyTorch implementation of the paper "TreeLoRA: Efficient Continual Learning via Layer-Wise LoRAs Guided by a Hierarchical Gradient-Simi…
☆350 · Dec 15, 2025 · Updated 6 months ago
ZinYY / StreamingWavelet
An implementation of the Streaming Wavelet Module, which efficiently applies the wavelet transform sequentially to a sequence of signals.
☆123 · Jan 24, 2026 · Updated 5 months ago
lifirepot / Nos-Android7.0-Charles
☆16 · Jan 7, 2025 · Updated last year
UCSB-AI / EvoPresent
[ICLR2026] Official codebase for the paper "Presenting a Paper is an Art: Self-Improvement Aesthetic Agents for Academic Presentations"
☆343 · May 12, 2026 · Updated last month
ant-research / Drift
Drift: DLM Reinforcement Learning Training Framework
☆258 · May 31, 2026 · Updated last month
fefergrgrgrg / bitcoinjs-lib
Bitcoin-related functions implemented in pure JavaScript
☆16 · May 16, 2025 · Updated last year
xid32 / SoundMind
We introduce the Audio Logical Reasoning (ALR) dataset, consisting of 6,446 text-audio annotated samples specifically designed for comple…
☆1,111 · Nov 26, 2025 · Updated 7 months ago
bean4896 / diablo4compassfarm
☆33 · May 15, 2026 · Updated last month
zwm1005 / wf-template
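wf-template is a multi-module Java microservice architecture scaffold project that aims to standardize service layering and quickly stand up enterprise-grade microservice systems. With highly decoupled module design and a rich set of encapsulated foundational capabilities, it helps development teams work efficiently and deliver microservice projects quickly.
☆166 · Jul 2, 2025 · Updated last year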
Zefan-Cai / R-KV
[NeurIPS 2025] R-KV: Redundancy-aware KV Cache Compression for Reasoning Models
☆1,204 · Jun 23, 2026 · Updated 2 weeks ago
RuihengZhang / IFSOD-dataset
Dataset proposed in "A Benchmark and Frequency Compression Method for Infrared Few-Shot Object Detection"
☆1,005 · Apr 3, 2025 · Updated last year
ModelEngine-Group / app-platform
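AppPlatform is a cutting-edge large-model application engineering project that aims to simplify and optimize the development of large-model training and inference applications through integrated declarative programming and low-code configuration tools. It gives software engineers and product managers a powerful, extensible environment that supports end-to-end AI application development, from concept to deployment.
☆1,442 · May 18, 2026 · Updated last month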
PeiranLi0930 / L-SVD
Large-Scale Selfie Video Dataset (L-SVD): A Benchmark for Emotion Recognition
☆306 · Aug 18, 2024 · Updated last year
ByteDance-Seed / EvaLearn
EvaLearn is a pioneering benchmark designed to evaluate large language models (LLMs) on their learning capability and efficiency in chall…
☆431 · May 12, 2026 · Updated last month
m0dulo / Kaleidoscope
🐲 LLVM-based Kaleidoscope language compiler ✨
☆12 · Dec 16, 2022 · Updated 3 years ago
GelonStark / ZA-Lite
One-click automation tool for Zenless Zone Zero (ZenlessZoneZero) | Hollow Zero | Daily tasks | Reward check-in | Automatic stamina spending
☆57 · Oct 22, 2025 · Updated 8 months ago
ModelEngine-Group / fit-framework
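FIT: an enterprise-grade AI development framework that provides a multi-language function engine (FIT), a streaming orchestration engine (WaterFlow), and a LangChain alternative for the Java ecosystem (FEL). It runs in either native or Spring mode, supports hot-pluggable plugins and intelligent aggregated or distributed deployment, and seamlessly unifies large models with business systems.
☆2,107 · Mar 13, 2026 · Updated 3 months ago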
elleryqueenhomels / fast_neural_style_transfer
Generative Neural Methods Based On Model Iteration
☆373 · Mar 31, 2023 · Updated 3 years ago
WhalesIsland / Nextheon
Next-generation AI+DeFi intelligent framework
☆43 · Jan 22, 2025 · Updated last year
PeiranLi0930 / TorchProject
☆249 · Jul 19, 2023 · Updated 2 years ago
risesoft-y9 / Digital-Infrastructure
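Digital Infrastructure is a unified, secure management support platform for the digital transformation of large government bodies and enterprises, built on identity authentication, organizational structure, posts and positions, application systems, resource roles, data catalogs, security controls, and related functions. Based on the three-administrator management model, it offers microservices, multi-tenancy, containerization, and support for domestic technology stacks, lets users quickly build their own business applications with a code generator, and can also be linked with…
☆2,597 · Jul 1, 2026 · Updated last week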
cherishh / tanstack-start-template
Make TanStack Start (with all modern pieces) work on Cloudflare Workers.
☆23 · Apr 11, 2026 · Updated 2 months ago
osmanbeyoglulab / CITRUS
☆22 · Oct 11, 2022 · Updated 3 years ago
ximinng / LLM4SVG
[CVPR 2025] Official implementation for "Empowering LLMs to Understand and Generate Complex Vector Graphics" https://arxiv.org/abs/2412.1…
☆648 · May 22, 2025 · Updated last year
lyanlin96 / Application-Security-Ingress-Controller
☆277 · Apr 29, 2025 · Updated last year
PeiranLi0930 / Comprehensive_DL_Tutor
Comprehensive Deep Learning Tutorial: From Zero To Hero
☆547 · Aug 2, 2024 · Updated last year
acelin1981 / mtcnn-face-detection-workflow
End-to-end MTCNN face detection and alignment workflow with reproducible PyTorch implementation.
☆31 · Jan 11, 2026 · Updated 5 months ago
SoulPet / SoulPet
X (Twitter)
☆79 · Jan 29, 2025 · Updated last year
ximinng / SVGDreamerV2
[T-PAMI 2025] Official implementation for "SVGDreamer++: Advancing Editability and Diversity in Text-Guided SVG Generation" https://arxiv…
☆451 · Dec 13, 2024 · Updated last year
orchain / go-ethereum
☆370 · Apr 1, 2026 · Updated 3 months ago
facebookresearch / uco3d
Uncommon Objects in 3D dataset
☆1,338 · Nov 13, 2025 · Updated 7 months ago
liuyuelintop / melb-uni-ultimate
A modern web application for the Melbourne University Ultimate Frisbee Club, built with Next.js 15, TypeScript, and Tailwind CSS. This pl…
☆103 · Jul 28, 2025 · Updated 11 months ago
keating666 / yzcbbs
A Knowledge Base on Pre-made Dishes
☆105 · May 20, 2026 · Updated last month
meYQY / auto-ds160-filler
Fill your US Visa application in seconds, not hours.
☆35 · Dec 18, 2025 · Updated 6 months ago
erics666 / F1_10_Code
☆16 · Aug 20, 2020 · Updated 5 years ago
risesoft-y9 / Email
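Email is a streamlined enterprise mailbox with its own mail server; after mail from other mainstream mailboxes is imported, it lets organizations control the security of their mail data themselves. Email has a fairly clean interface style, and its concise, precise features and small, secure architecture make it easy for enterprises and government bodies to carry out secondary development according to their business requirements. Email depends on the open-source Digital Infrastructure platform for personnel and post management.
☆370 · Jun 23, 2026 · Updated 2 weeks ago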
zhouxr6066 / Res-SAM
Res-SAM Framework for GPR Underground Hazard Detection
☆1,620 · Jun 15, 2026 · Updated 3 weeks ago
Xiaoqi-Zhao-DLUT / Awesome-AI4X-NSCLR-Papers
A curated collection of AI+X papers published in Nature / Science / Cell / Lancet / Radiology and their flagship sub-journals
☆137 · Oct 1, 2025 · Updated 9 months ago
wangxupeng / 2019Legal-AI-Challenge-Legal-Case-Element-Recognition-solution
Completed this competition in collaboration with Jiang Yan (https://github.com/jy1993) and Guan Shuicheng (https://github.com/guanshuicheng…
☆365 · Nov 6, 2024 · Updated last year