NVlabs/GDPO

Official implementation of GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

☆487

Alternatives and similar repositories for GDPO

Users who are interested in GDPO are comparing it to the libraries listed below.

deepglint / DanQing
The official repo for the DanQing dataset.
☆36 • Mar 25, 2026 • Updated 3 months ago
ssssmark / AesR1
Unlocking the Essence of Beauty: Advanced Aesthetic Reasoning with Relative-Absolute Policy Optimization
☆27 • Jan 27, 2026 • Updated 5 months ago
TencentARC / DSR_Suite
☆73 • Apr 21, 2026 • Updated 2 months ago
sail-sg / Video-Next-Event-Prediction
☆28 • Aug 9, 2025 • Updated 11 months ago
yuleiqin / RAIF
A Recipe for Building LLM Reasoners to Solve Complex Instructions
☆32 • Oct 9, 2025 • Updated 9 months ago
WooooDyy / BMMR
Code and resources for the NeurIPS 2025 Paper "BMMR: A Large-Scale Bilingual Multimodal Multi-Discipline Reasoning Dataset" by Zhiheng X…
☆18 • Oct 14, 2025 • Updated 8 months ago
zlab-princeton / llm-pruning-collection
A collection of various LLM pruning implementations, training code for GPUs & TPUs, and evaluation scripts.
☆68 • Apr 20, 2026 • Updated 2 months ago
kkk-an / UltraIF
Code of EMNLP 2025 paper 'UltraIF: Advancing Instruction Following from the Wild'.
☆21 • Apr 3, 2025 • Updated last year
MingLiiii / ThinkARM
Schoenfeld’s Anatomy of Mathematical Reasoning by Language Models
☆27 • Dec 21, 2025 • Updated 6 months ago
bimsarapathiraja / refedit
[ICCV 2025] Official Implementation of RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model for Referring …
☆20 • Jun 27, 2025 • Updated last year
mlvlab / DeepVideoR1
[NeurIPS25] Official Implementation (PyTorch) of "DeepVideo-R1"
☆35 • Feb 22, 2026 • Updated 4 months ago
XavierJiezou / Face-MoGLE
[TPAMI 2026] Mixture of Global and Local Experts with Diffusion Transformer for Controllable Face Generation
☆34 • Jul 2, 2026 • Updated last week
BytedanceDouyinContent / SAIL-VL2
The SAIL-VL2 series model developed by the BytedanceDouyinContent Group
☆80 • Sep 18, 2025 • Updated 9 months ago
THU-KEG / PairJudgeRM
☆15 • Apr 14, 2025 • Updated last year
leroy9472 / InMind
☆15 • Nov 18, 2025 • Updated 7 months ago
JingMog / THOR
[ICLR-2026] Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning".
☆33 • Feb 26, 2026 • Updated 4 months ago
sail-sg / Precision-RL
Defeating the Training-Inference Mismatch via FP16
☆196 • Nov 14, 2025 • Updated 7 months ago
kakaoenterprise / OutFlip
Implementation of the ACL Findings paper "OutFlip: Generating Examples for Unknown Intent Detection with Natural Language Attack"
☆10 • May 24, 2021 • Updated 5 years ago
rongyaofang / prism-bench
This is the official repository for the paper "FLUX-Reason-6M & PRISM-Bench: A Million-Scale Text-to-Image Reasoning Dataset and Comprehe…
☆131 • Jan 29, 2026 • Updated 5 months ago
GradiusTwinbee / GLIS
Official code for ECCV 2024 paper "Global-Local Collaborative Inference with LLM for Lidar-Based Open-Vocabulary Detection"
☆14 • Jul 4, 2024 • Updated 2 years ago
saibr / hypvl
This repository is related to 'Intriguing Properties of Hyperbolic Embeddings in Vision-Language Models', published at TMLR (2024), https…
☆21 • Jul 5, 2024 • Updated 2 years ago
CSJianYang / Industrial-Coder
☆115 • Mar 27, 2026 • Updated 3 months ago
xiaohangt / wd1
Official Implementation of wd1
☆32 • Sep 25, 2025 • Updated 9 months ago
apoorv-ml / Transformers-Sensor-Fusion
This repo holds trending techniques for sensor fusion tasks using Transformers
☆14 • Feb 21, 2023 • Updated 3 years ago
ViktorAxelsen / MemSkill
MemSkill: Learning and Evolving Memory Skills for Self-Evolving Agents
☆541 • May 23, 2026 • Updated last month
XueZeyue / Awesome-Visual-Generation-Alignment-Survey
A survey for visual generation alignment
☆143 • Nov 9, 2025 • Updated 8 months ago
Trae1ounG / BuPO
[arxiv: 2512.19673] Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies
☆60 • Feb 6, 2026 • Updated 5 months ago
DragonAura / EE_DA_OJ
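Reference solutions for the OJ (online judge) of the Data and Algorithms course, Department of Electronic Engineering, Tsinghua University, Fall 2022 semester
☆10 • Jun 18, 2023 • Updated 3 years ago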
ali-vilab / TTS-VAR
Test-time Scaling for VAR models
☆31 • Sep 19, 2025 • Updated 9 months ago
naver-ai / prolip
☆58 • Aug 16, 2025 • Updated 10 months ago
gogoduan / GoT-R1
[ICLR26] GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning
☆106 • Jan 27, 2026 • Updated 5 months ago
kts707 / camm
CAMM: Building Category-Agnostic and Animatable 3D Models from Monocular Videos
☆14 • Jun 14, 2024 • Updated 2 years ago
Gen-Verse / Open-AgentRL
RLAnything (ICML 2026) & AutoTool (ICML 2026), DemyAgent: Open-Source RL for LLMs and Agentic Scenarios
☆571 • Jun 12, 2026 • Updated 3 weeks ago
OmniForcing / OmniForcing
Official implementation of "OmniForcing: Unleashing Real-time Joint Audio-Visual Generation" [arXiv:2603.11647]. OmniForcing is the first …
☆165 • Jun 14, 2026 • Updated 3 weeks ago
llllly26 / ComplexBench-Edit
[ACMMM 2025] ComplexBench-Edit: Benchmarking Complex Instruction-Driven Image Editing via Compositional Dependencies
☆22 • Jun 20, 2025 • Updated last year
bigai-nlco / RuleReasoner
[ICLR 2026] RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling
☆39 • Feb 25, 2026 • Updated 4 months ago
GaryGuTC / UniME-v2
[AAAI 2026 Oral] The official code of "UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning"
☆74 • Dec 8, 2025 • Updated 7 months ago
X-PLUG / SocialBench
RoleInteract: Evaluating the Social Interaction of Role-Playing Agents
☆70 • Oct 12, 2024 • Updated last year
CodeGoat24 / UnifiedReward
Official implementation of UnifiedReward & [NeurIPS 2025] UnifiedReward-Think & UnifiedReward-Flex
☆790 • Jun 18, 2026 • Updated 3 weeks ago