MikeWangWZHL/PAPO

Official repo for "PAPO: Perception-Aware Policy Optimization for Multimodal Reasoning"

☆152

Alternatives and similar repositories for PAPO

Users who are interested in PAPO are comparing it to the libraries listed below.

real-absolute-AI / NoisyRollout
[NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation
☆112 · Sep 18, 2025 · Updated 10 months ago
xzxxntxdy / PEPO
Official repo for "Rethinking Token-Level Policy Optimization for Multimodal Chain-of-Thought"
☆26 · Mar 29, 2026 · Updated 3 months ago
huaixuheqing / VPPO-RL
[ICLR 2026] Official repo for "Spotlight on Token Perception for Multimodal Reinforcement Learning"
☆69 · Apr 3, 2026 · Updated 3 months ago
TIGER-AI-Lab / VL-Rethinker
The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25]
☆190 · Jun 5, 2025 · Updated last year
AntResearchNLP / ViLaSR
[NeurIPS 2025] Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing
☆98 · Jul 27, 2025 · Updated last year
PKU-YuanGroup / Look-Back
This repository is the official implementation of "Look-Back: Implicit Visual Re-focusing in MLLM Reasoning".
☆100 · Jul 10, 2025 · Updated last year
zhaochen0110 / Awesome_Think_With_Images
Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual in…
☆1,494 · Mar 9, 2026 · Updated 4 months ago
ByteDance-BandAI / CodeVision
[CVPR 2026] Thinking with Programming Vision: Towards a Unified View for Thinking with Images
☆71 · Jan 23, 2026 · Updated 6 months ago
kxfan2002 / SophiaVL-R1
SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward
☆94 · Aug 8, 2025 · Updated 11 months ago
Visual-Agent / DeepEyes
☆1,251 · Nov 20, 2025 · Updated 8 months ago
hiyouga / EasyR1
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
☆5,082 · Updated this week
Tongyi-Zhiwen / QwenLong-CPRS
☆86 · May 28, 2025 · Updated last year
BitSecret / HyperGNet
Geometric Problem Solving Integrating FormalGeo Symbolic System and Hypergraph Neural Network.
☆16 · Sep 23, 2025 · Updated 10 months ago
VincentLeebang / lvr
Official codebase for the paper Latent Visual Reasoning
☆171 · Oct 22, 2025 · Updated 9 months ago
NOVAglow646 / Monet
[CVPR 2026] Official codes of "Monet: Reasoning in Latent Visual Space Beyond Image and Language"
☆213 · Mar 19, 2026 · Updated 4 months ago
chenmeiqii / FIGR
Official implementation of "Figure It Out: Improve the Frontier of Reasoning with Active Visual Thinking"
☆17 · Jan 13, 2026 · Updated 6 months ago
THUNLP-MT / MUSEG
Repo for paper "MUSEG: Reinforcing Video Temporal Understanding via Timestamp-Aware Multi-Segment Grounding".
☆40 · Jun 9, 2025 · Updated last year
TIGER-AI-Lab / Pixel-Reasoner
Pixel-Level Reasoning Model trained with RL [NeurIPS25]
☆301 · Jun 4, 2026 · Updated last month
maifoundations / Visionary-R1
Mitigating Shortcuts in Visual Reasoning with Reinforcement Learning
☆44 · Jul 2, 2025 · Updated last year
uclanlp / OpenVLThinker
OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement
☆155 · May 25, 2026 · Updated 2 months ago
EvolvingLMMs-Lab / multimodal-search-r1
[ACL-2026] MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal…
☆470 · Apr 7, 2026 · Updated 3 months ago
Mini-o3 / Mini-o3
Official Code for "Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search"
☆423 · Jan 29, 2026 · Updated 5 months ago
PKU-ICST-MIPL / FineR1_ICLR2026
☆68 · Apr 4, 2026 · Updated 3 months ago
Osilly / Awesome-Interleaving-Reasoning
Interleaving Reasoning: Next-Generation Reasoning Systems for AGI
☆280 · Jun 5, 2026 · Updated last month
TideDra / lmm-r1
Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.
☆848 · May 14, 2025 · Updated last year
Sun-Haoyuan23 / Awesome-RL-based-Reasoning-MLLMs
This repository provides valuable reference for researchers in the field of multimodality, please start your exploratory travel in RL-bas…
☆1,435 · May 11, 2026 · Updated 2 months ago
yix8 / VisualPlanning
[ICLR 2026 Oral] Visual Planning: Let's Think Only with Images
☆365 · Apr 24, 2026 · Updated 3 months ago
zss02 / BiPS
[CVPR 2026] See Less, See Right: Bi-directional Perceptual Shaping For Multimodal Reasoning
☆21 · Jun 28, 2026 · Updated 3 weeks ago
inclusionAI / M2-Reasoning
M2-Reasoning: Empowering MLLMs with Unified General and Spatial Reasoning
☆47 · Jul 17, 2025 · Updated last year
xlyu0106 / ViF
[ICLR 26] Visual Multi-Agent System: Mitigating Hallucination Snowballing via Visual Flow
☆44 · Oct 3, 2025 · Updated 9 months ago
Dtc7w3PQ / PRCO
Official implementation of Seeing with You: Perception-Reasoning Co-evolution for Multimodal Reasoning.
☆30 · Jul 2, 2026 · Updated 3 weeks ago
uni-medical / GMAI-VL-R1
☆19 · Jul 21, 2025 · Updated last year
zhaochen0110 / OpenThinkIMG
OpenThinkIMG is an end-to-end open-source framework that empowers LVLMs to think with images.
☆399 · Jun 1, 2025 · Updated last year
Tongyi-Zhiwen / Qwen-Doc
☆548 · May 25, 2026 · Updated 2 months ago
UCSB-AI / DMLR
[CVPR2026] Official codebase for the paper "Reasoning Within the Mind: Dynamic Multimodal Interleaving in Latent Space"
☆84 · May 12, 2026 · Updated 2 months ago
zli12321 / Vision-SR1
Reinforcement Learning of Vision Language Models with Self Visual Perception Reward
☆175 · Mar 14, 2026 · Updated 4 months ago
LiangThree / MCMA
☆16 · Jan 12, 2026 · Updated 6 months ago
UCSB-AI / GRIT
Official code for NeurIPS 2025 paper "GRIT: Teaching MLLMs to Think with Images"
☆190 · Jan 16, 2026 · Updated 6 months ago
InternLM / Spatial-SSRL
[CVPR 2026] Official release of "Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning"
☆133 · Apr 7, 2026 · Updated 3 months ago