A PyTorch implementation of the paper "Provably Efficient Online RLHF with One-Pass Reward Modeling". This repository provides a flexible and modular approach to Online Reinforcement Learning from Human Feedback (Online RLHF).
☆92Dec 13, 2025Updated 4 months ago
Alternatives and similar repositories for Online_RLHF
Users that are interested in Online_RLHF are comparing it to the libraries listed below.
- A PyTorch implementation of the paper "TreeLoRA: Efficient Continual Learning via Layer-Wise LoRAs Guided by a Hierarchical Gradient-Simi…☆350Dec 15, 2025Updated 4 months ago
- An implementation of the Streaming Wavelet Module, which efficiently applies the wavelet transform sequentially to a sequence of signals.☆122Jan 24, 2026Updated 2 months ago
- ☆16Jan 7, 2025Updated last year
- [ICLR 2026] Official codebase for the paper "Presenting a Paper is an Art: Self-Improvement Aesthetic Agents for Academic Presentations"☆336Oct 14, 2025Updated 6 months ago
- Bitcoin-related functions implemented in pure JavaScript☆16May 16, 2025Updated 11 months ago
- We introduce the Audio Logical Reasoning (ALR) dataset, consisting of 6,446 text-audio annotated samples specifically designed for comple…☆1,107Nov 26, 2025Updated 4 months ago
- ☆31Jan 27, 2026Updated 2 months ago
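- wf-template is a multi-module Java microservice architecture scaffold project intended to standardize service layering and quickly set up enterprise-grade microservice systems. Through highly decoupled module design and rich encapsulation of foundational capabilities, it helps development teams build efficiently and deliver microservice projects quickly.☆166Jul 2, 2025Updated 9 months ago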
- [NeurIPS 2025] R-KV: Redundancy-aware KV Cache Compression for Reasoning Models☆1,191Oct 16, 2025Updated 6 months ago
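- AppPlatform is a cutting-edge large-model application engineering project that aims to simplify and optimize the development of large-model training and inference applications through integrated declarative programming and low-code configuration tools. It provides software engineers and product managers with a powerful, extensible environment that supports end-to-end AI application development, from concept to deployment.☆1,431Apr 8, 2026Updated last week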
- Dataset proposed in "A Benchmark and Frequency Compression Method for Infrared Few-Shot Object Detection"☆1,003Apr 3, 2025Updated last year
- Large-Scale Selfie Video Dataset (L-SVD): A Benchmark for Emotion Recognition☆306Aug 18, 2024Updated last year
- EvaLearn is a pioneering benchmark designed to evaluate large language models (LLMs) on their learning capability and efficiency in chall…☆432Jan 18, 2026Updated 3 months ago
- Zenless Zone Zero (ZenlessZoneZero) one-click automation tool | Hollow Zero | Daily tasks | Reward check-in | Automatic stamina clearing☆56Oct 22, 2025Updated 5 months ago
- 🐲 LLVM-based Kaleidoscope language compiler ✨☆12Dec 16, 2022Updated 3 years ago
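- FIT: an enterprise-grade AI development framework providing a multi-language function engine (FIT), a streaming orchestration engine (WaterFlow), and a LangChain alternative for the Java ecosystem (FEL). It runs in native or Spring mode, supports hot-pluggable plugins and intelligent aggregated or distributed deployment, and seamlessly unifies large models with business systems.☆2,111Mar 13, 2026Updated last month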
- Generative Neural Methods Based On Model Iteration☆373Mar 31, 2023Updated 3 years ago
- ☆249Jul 19, 2023Updated 2 years ago
- Next-generation AI+DeFi intelligent framework☆43Jan 22, 2025Updated last year
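- Digital Foundation (数字底座) is a unified, secure management support platform for the digital transformation of large governments and enterprises, built on identity authentication, organizational structure, positions and duties, application systems, resource roles, data catalogs, security controls, and other functions. Based on the three-administrator management model, it offers microservices, multi-tenancy, containerization, and support for domestic Chinese technology stacks, lets users quickly build their own business applications with a code generator, and can also be linked with various…☆2,583Updated this week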
- Make TanStack Start (with all modern pieces) work on Cloudflare Workers.☆22Apr 11, 2026Updated last week
- [CVPR 2025] Official implementation for "Empowering LLMs to Understand and Generate Complex Vector Graphics" https://arxiv.org/abs/2412.1…☆628May 22, 2025Updated 10 months ago
- ☆278Apr 29, 2025Updated 11 months ago
- Comprehensive Deep Learning Tutorial: From Zero To Hero☆547Aug 2, 2024Updated last year
- End-to-end MTCNN face detection and alignment workflow with reproducible PyTorch implementation.☆31Jan 11, 2026Updated 3 months ago
- ☆38Jan 2, 2025Updated last year
- X(Twitter)☆79Jan 29, 2025Updated last year
- [T-PAMI 2025] Official implementation for "SVGDreamer++: Advancing Editability and Diversity in Text-Guided SVG Generation" https://arxiv…☆448Dec 13, 2024Updated last year
- ☆370Apr 1, 2026Updated 2 weeks ago
- Fill your US Visa application in seconds, not hours.☆34Dec 18, 2025Updated 4 months ago
- A modern web application for the Melbourne University Ultimate Frisbee Club, built with Next.js 15, TypeScript, and Tailwind CSS. This pl…☆101Jul 28, 2025Updated 8 months ago
- Uncommon Objects in 3D dataset☆1,317Nov 13, 2025Updated 5 months ago
- A curated collection of AI+X papers published in Nature / Science / Cell / Lancet / Radiology and their flagship sub-journals☆136Oct 1, 2025Updated 6 months ago
- A Knowledge Base on Pre-made Dishes☆105Jun 15, 2025Updated 10 months ago
- ☆165Updated this week
- ☆16Aug 20, 2020Updated 5 years ago
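- E-mail (电子邮件) is a streamlined enterprise mailbox with its own mail server that lets organizations control the security of their mail data themselves after importing mail from other mainstream mailboxes. It has a fairly clean interface style, and its concise, precise functionality and small, secure architecture make it easy for enterprises and governments to carry out secondary development according to business requirements. It depends on the open-source Digital Foundation (数字底座) for personnel and position management.☆367Mar 31, 2026Updated 2 weeks ago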
- [CVPR 2026] Benchmarking PhD-Level Coding in 3D Geometric Computer Vision☆44Apr 1, 2026Updated 2 weeks ago
- Res-SAM Framework for GPR Underground Hazard Detection☆1,615Nov 15, 2025Updated 5 months ago