callsys/GMPO

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/callsys/GMPO)

callsys / GMPO

[ICLR 2026] Geometric-Mean Policy Optimization

☆104

Alternatives and similar repositories for GMPO

Users that are interested in GMPO are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

mingrui-wu / OSI-Bench
View on GitHub
Official repo of From Indoor to Open World: Revealing the Spatial Reasoning Gap in MLLMs
☆24Jun 23, 2026Updated last month
MzeroMiko / XDLM
View on GitHub
[ICML 2026 Spotlight] Code for miXed Discrete Diffusion Language Model
☆27Mar 16, 2026Updated 4 months ago
YWenxi / think-with-images-through-self-calling
View on GitHub
official repo for `thinking with images through-self-calling`
☆26Dec 28, 2025Updated 6 months ago
callsys / ControlCap
View on GitHub
[ECCV 2024] ControlCap: Controllable Region-level Captioning
☆81Oct 25, 2024Updated last year
callsys / DynRefer
View on GitHub
[CVPR 2025] DynRefer: Delving into Region-level Multimodal Tasks via Dynamic Resolution
☆59Mar 4, 2025Updated last year
Virtual machines for every use case on DigitalOcean • Ad
Get dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
callsys / GenPromp
View on GitHub
[ICCV 2023] Generative Prompt Model for Weakly Supervised Object Localization
☆57Nov 10, 2023Updated 2 years ago
AMAP-ML / GPG
View on GitHub
[ICLR26]GPG: A Simple and Strong Reinforcement Learning Baseline for Model Reasoning
☆179Jan 29, 2026Updated 5 months ago
wizard-III / ArcherCodeR
View on GitHub
ArcherCodeR is an open-source initiative enhancing code reasoning in large language models through scalable, rule-governed reinforcement …
☆44Aug 6, 2025Updated 11 months ago
shengliu66 / FractionalReason
View on GitHub
Official github repo for "Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute"
☆17Jun 30, 2025Updated last year
Kwai-Klear / RLEP
View on GitHub
RL with Experience Replay
☆59Jul 27, 2025Updated 11 months ago
martian422 / MaskGRPO
View on GitHub
The official implementation of MaskGRPO: Consolidating Reinforcement Learning for Multimodal Discrete Diffusion Models. (ICLR 2026, arxiv…
☆19Jan 27, 2026Updated 5 months ago
AZZMM / CC-Diff
View on GitHub
Implementation of paper "CC-Diff: Enhancing Contextual Coherence in Remote Sensing Image Synthesis"
☆28Dec 19, 2025Updated 7 months ago
AgenticIR-Lab / OThink-R1
View on GitHub
This is the official code for OThink-R1 project.
☆21Jun 19, 2025Updated last year
Qwen-Applications / CLIPO
View on GitHub
CLIPO: Contrastive Learning in Policy Optimization Generalizes RLVR
☆21Apr 7, 2026Updated 3 months ago
Bare Metal GPUs on DigitalOcean Gradient AI • Ad
Purpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
Optimization-AI / DisCO
View on GitHub
NeurIPS 2025: Discriminative Constrained Optimization for Reinforcing Large Reasoning Models
☆53Mar 14, 2026Updated 4 months ago
callsys / FlowText
View on GitHub
[ICME 2023] FlowText: Synthesizing Realistic Scene Text Video with Optical Flow Estimation
☆13May 13, 2023Updated 3 years ago
wizard-III / Archer2.0
View on GitHub
Archer2.0 evolves from its predecessor by introducing ASPO, which overcomes fundamental PPO-Clip limitations to prevent premature converg…
☆31Oct 10, 2025Updated 9 months ago
OpenBMB / RLPR
View on GitHub
Extrapolating RLVR to General Domains without Verifiers
☆205Aug 12, 2025Updated 11 months ago
sail-sg / understand-r1-zero
View on GitHub
Understanding R1-Zero-Like Training: A Critical Perspective
☆1,268Aug 27, 2025Updated 10 months ago
qiujihao19 / Artemis
View on GitHub
[NeurIPS 2024] Artemis: Towards Referential Understanding in Complex Videos
☆27Apr 8, 2025Updated last year
qiuzh20 / RMoE
View on GitHub
Official implementation of RMoE (Layerwise Recurrent Router for Mixture-of-Experts)
☆33Aug 4, 2024Updated last year
yushuiwx / MH-MoE
View on GitHub
☆20Nov 5, 2024Updated last year
yinyueqin / DenseRewardRLHF-PPO
View on GitHub
This repository contains the code and released models for the paper Segmenting Text and Learning Their Rewards for Improved RLHF in Langu…
☆19Jan 8, 2025Updated last year
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
MasterVito / DAC-RL
View on GitHub
Official Repo for DAC-RL: Training LLMs for Divide-and-Conquer Reasoning Elevates Test-Time Scalability
☆16Feb 26, 2026Updated 4 months ago
qiujihao19 / LongVideo-R1
View on GitHub
[CVPR 2026] LongVideo-R1: Smart Navigation for Low-cost Long Video Understanding
☆50Jul 7, 2026Updated 2 weeks ago
zhaoyangwei123 / SAPNet
View on GitHub
CVPR2024, Semantic-aware SAM for Point-Prompted Instance Segmentation
☆38Jan 20, 2025Updated last year
RUCAIBox / Passk_Training
View on GitHub
The official repository of paper "Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models''
☆113Aug 15, 2025Updated 11 months ago
inclusionAI / Ring-V2
View on GitHub
Ring-V2 is a reasoning MoE LLM provided and open-sourced by InclusionAI.
☆98Oct 23, 2025Updated 9 months ago
Leey21 / A-Data-Centric-Study
View on GitHub
☆18Mar 2, 2026Updated 4 months ago
THUDM / TreeRL
View on GitHub
TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25
☆97Jun 16, 2025Updated last year
chiefovoavicii / MAD-OPD
View on GitHub
Official code for "Breaking the Ceiling in On-Policy Distillation via Multi-Agent Debate" (arXiv:2605.01347).
☆32May 7, 2026Updated 2 months ago
sail-sg / VeriFree
View on GitHub
Reinforcing General Reasoning without Verifiers
☆102Jun 24, 2025Updated last year
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
suu990901 / KlearReasoner
View on GitHub
Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization
☆82Dec 25, 2025Updated 7 months ago
SalesforceAIResearch / Elastic-Reasoning
View on GitHub
Make reasoning models scalable
☆51Jun 2, 2026Updated last month
MingyuJ666 / Rope_with_LLM
View on GitHub
[ICML'25] Our study systematically investigates massive values in LLMs' attention mechanisms. First, we observe massive values are concen…
☆87Jun 20, 2025Updated last year
Alibaba-Quark / SSP
View on GitHub
Search Self-Play: Pushing the Frontier of Agent Capability without Supervision
☆103Mar 4, 2026Updated 4 months ago
liumy2010 / UFT
View on GitHub
UFT: Unifying Supervised and Reinforcement Fine-Tuning
☆31Jun 30, 2025Updated last year
chanchimin / AgentMonitor
View on GitHub
Codes for our paper "AgentMonitor: A Plug-and-Play Framework for Predictive and Secure Multi-Agent Systems"
☆13Dec 13, 2024Updated last year
whn09 / VITA
View on GitHub
✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM
☆11Jun 16, 2025Updated last year