[ICLR 2026] Geometric-Mean Policy Optimization
☆103 · Jan 26, 2026 · Updated 3 months ago
Alternatives and similar repositories for GMPO
Users that are interested in GMPO are comparing it to the libraries listed below.
- [ICCV 2023] Generative Prompt Model for Weakly Supervised Object Localization ☆57 · Nov 10, 2023 · Updated 2 years ago
- ☆18 · Mar 2, 2026 · Updated 2 months ago
- [ICLR26] GPG: A Simple and Strong Reinforcement Learning Baseline for Model Reasoning ☆182 · Jan 29, 2026 · Updated 3 months ago
- [ECCV 2024] ControlCap: Controllable Region-level Captioning ☆81 · Oct 25, 2024 · Updated last year
- Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model ☆13 · Feb 11, 2025 · Updated last year
- Understanding R1-Zero-Like Training: A Critical Perspective ☆1,259 · Aug 27, 2025 · Updated 8 months ago
- [CVPR 2025] DynRefer: Delving into Region-level Multimodal Tasks via Dynamic Resolution ☆59 · Mar 4, 2025 · Updated last year
- Rethinking the Trust Region in LLM Reinforcement Learning ☆54 · Mar 2, 2026 · Updated 2 months ago
- [EMNLP 2024 Tutorial] Language Agents: Foundations, Prospects, and Risks ☆10 · Nov 27, 2024 · Updated last year
- PyTorch code for our paper "Progressive Binarization with Semi-Structured Pruning for LLMs" ☆13 · Mar 11, 2026 · Updated 2 months ago
- [AAAI2025] ChatterBox: Multi-round Multimodal Referring and Grounding, Multimodal, Multi-round dialogues ☆61 · May 2, 2025 · Updated last year
- [NeurIPS 2024] Artemis: Towards Referential Understanding in Complex Videos ☆27 · Apr 8, 2025 · Updated last year
- The official python toolkit for running experiments and evaluate performance on VideoCube benchmark @TPAMI2023 ☆31 · Apr 1, 2024 · Updated 2 years ago
- [ICME 2023] FlowText: Synthesizing Realistic Scene Text Video with Optical Flow Estimation ☆13 · May 13, 2023 · Updated 3 years ago
- ☆20 · Nov 5, 2024 · Updated last year
- ☆34 · Nov 18, 2025 · Updated 6 months ago
- Codes for our paper "AgentMonitor: A Plug-and-Play Framework for Predictive and Secure Multi-Agent Systems" ☆13 · Dec 13, 2024 · Updated last year
- ☆39 · Nov 18, 2025 · Updated 6 months ago
- TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25 ☆92 · Jun 16, 2025 · Updated 11 months ago
- C3D,R(21)D,R3D--pytorch ☆10 · Sep 11, 2018 · Updated 7 years ago
- [ICLR 2026] Quantile Advantage Estimation for Entropy-Safe Reasoning ☆28 · Oct 14, 2025 · Updated 7 months ago
- [ICML'25] Our study systematically investigates massive values in LLMs' attention mechanisms. First, we observe massive values are concen… ☆86 · Jun 20, 2025 · Updated 11 months ago
- ☆11 · May 18, 2025 · Updated last year
- Official implementation of "Seeing is Understanding: Unlocking Causal Attention into Modality-Mutual Attention for Multimodal LLMs" ☆20 · May 11, 2026 · Updated 2 weeks ago
- [NeurIPS-24] This is the official implementation of the paper "DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effect… ☆87 · Jun 17, 2024 · Updated last year
- The original Shared Recurrent Memory Transformer implementation ☆36 · Jul 11, 2025 · Updated 10 months ago
- The official implementation of paper: SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction. ☆51 · Oct 18, 2024 · Updated last year
- [CVPR 2025] Adaptive Keyframe Sampling for Long Video Understanding ☆213 · Dec 19, 2025 · Updated 5 months ago
- ☆64 · Mar 30, 2026 · Updated last month
- Extrapolating RLVR to General Domains without Verifiers ☆203 · Aug 12, 2025 · Updated 9 months ago
- Repo for paper "CODIS: Benchmarking Context-Dependent Visual Comprehension for Multimodal Large Language Models". ☆12 · Oct 14, 2024 · Updated last year
- ☆46 · Sep 27, 2025 · Updated 7 months ago
- Emergent Hierarchical Reasoning in LLMs/VLMs through Reinforcement Learning [ICLR26] ☆64 · Apr 11, 2026 · Updated last month
- [ICLR26] GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning ☆105 · Jan 27, 2026 · Updated 3 months ago
- [ICML 2025] SparseLoRA: Accelerating LLM Fine-Tuning with Contextual Sparsity ☆75 · Mar 10, 2026 · Updated 2 months ago
- Implementation of "VL-Mamba: Exploring State Space Models for Multimodal Learning" ☆86 · Mar 21, 2024 · Updated 2 years ago
- MetaAgent: Toward Self-Evolving Agent via Tool Meta-Learning ☆45 · Sep 3, 2025 · Updated 8 months ago
- This repo contains evaluation code for the paper "AV-Odyssey: Can Your Multimodal LLMs Really Understand Audio-Visual Information?" ☆31 · Dec 23, 2024 · Updated last year
- ☆11 · Aug 26, 2021 · Updated 4 years ago