JIA-Lab-research/VisionReasoner

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/JIA-Lab-research/VisionReasoner)

JIA-Lab-research / VisionReasoner

[ICLR 2026] VisionReasoner: Unified Reasoning-Integrated Visual Perception via Reinforcement Learning

☆348

Alternatives and similar repositories for VisionReasoner

Users that are interested in VisionReasoner are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

JIA-Lab-research / Seg-Zero
View on GitHub
Project Page For "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement"
☆635Jan 17, 2026Updated 6 months ago
songw-zju / PixelThink
View on GitHub
The official implementation of "PixelThink: Towards Efficient Chain-of-Pixel Reasoning" (ICML 2026)
☆43Jul 4, 2026Updated 2 weeks ago
ysj9909 / StAR
View on GitHub
[ECCV 2026] StAR: Segment Anything Reasoner
☆25Apr 2, 2026Updated 3 months ago
PolyU-ChenLab / UniPixel
View on GitHub
🔮 UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning (NeurIPS 2025)
☆247Jan 4, 2026Updated 6 months ago
hq-King / Affordance-R1
View on GitHub
code for affordance-r1
☆73May 11, 2026Updated 2 months ago
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
yayafengzi / ALToLLM
View on GitHub
ALTo: Adaptive-Length Tokenizer for Autoregressive Mask Generation
☆30May 27, 2025Updated last year
Mini-o3 / Mini-o3
View on GitHub
Official Code for "Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search"
☆422Jan 29, 2026Updated 5 months ago
hustvl / LENS
View on GitHub
[AAAI 2026 Oral] LENS: Learning to Segment Anything with Unified Reinforced Reasoning
☆136Dec 3, 2025Updated 7 months ago
aim-uofa / SegAgent
View on GitHub
[CVPR2025] SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories
☆106Aug 8, 2025Updated 11 months ago
hiyouga / EasyR1
View on GitHub
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
☆5,071Updated this week
geshang777 / Seg-R1
View on GitHub
[NeurIPS-W 2025] Official Implementation of "Seg-R1: Segmentation Can Be Surprisingly Simple with Reinforcement Learning"
☆72Jul 1, 2025Updated last year
Visual-Agent / DeepEyes
View on GitHub
☆1,249Nov 20, 2025Updated 8 months ago
Liuziyu77 / Visual-RFT
View on GitHub
Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning'’
☆2,261Oct 29, 2025Updated 8 months ago
TIGER-AI-Lab / VL-Rethinker
View on GitHub
The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25]
☆189Jun 5, 2025Updated last year
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
JIA-Lab-research / LISA
View on GitHub
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
☆2,662Feb 16, 2025Updated last year
JIA-Lab-research / VisionThink
View on GitHub
[NeurIPS 2025] Efficient Reasoning Vision Language Models
☆459Sep 18, 2025Updated 10 months ago
suikei-wang / RESAnything
View on GitHub
[NeurIPS 2025] RESAnything: Attribute Prompting for Arbitrary Referring Segmentation
☆19May 26, 2026Updated last month
AntResearchNLP / ViLaSR
View on GitHub
[NeurIPS 2025] Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing
☆98Jul 27, 2025Updated 11 months ago
zhaochen0110 / Awesome_Think_With_Images
View on GitHub
Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual in…
☆1,492Mar 9, 2026Updated 4 months ago
eVI-group-SCU / Dr-Seg
View on GitHub
[CVPR'26] Dr. Seg: Revisiting GRPO Training for Visual Large Language Models through Perception-Oriented Design
☆31Mar 7, 2026Updated 4 months ago
nnnth / UFO
View on GitHub
[NeurIPS2025 Spotlight 🔥 ] Official implementation of 🛸 "UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Langu…
☆280Nov 5, 2025Updated 8 months ago
ls-kelvin / REVPT
View on GitHub
Code for paper: Reinforced Vision Perception with Tools
☆74Oct 3, 2025Updated 9 months ago
tulerfeng / Video-R1
View on GitHub
Video-R1: Reinforcing Video Reasoning in MLLMs [🔥the first paper to explore R1 for video]
☆879Dec 14, 2025Updated 7 months ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
rui-qian / READ
View on GitHub
Rui Qian, Xin Yin, Dejing Dou†: Reasoning to Attend: Try to Understand How <SEG> Token Works (CVPR 2025)
☆54Feb 4, 2026Updated 5 months ago
MaverickRen / PixelLM
View on GitHub
[CVPR 2024] PixelLM is an effective and efficient LMM for pixel-level reasoning and understanding.
☆273Feb 11, 2025Updated last year
wanghao9610 / X-SAM
View on GitHub
[AAAI2026] X-SAM: From Segment Anything to Any Segmentation
☆382Jul 14, 2026Updated last week
IDEA-Research / Rex-Omni
View on GitHub
[CVPR2026] Detect Anything via Next Point Prediction
☆1,507Feb 22, 2026Updated 4 months ago
TIGER-AI-Lab / Pixel-Reasoner
View on GitHub
Pixel-Level Reasoning Model trained with RL [NeuIPS25]
☆300Jun 4, 2026Updated last month
om-ai-lab / VLM-R1
View on GitHub
Solve Visual Understanding with Reinforced VLMs
☆6,010Jul 7, 2026Updated 2 weeks ago
AMAP-ML / UniVG-R1
View on GitHub
UniVG-R1: Reasoning Guided Universal Visual Grounding with Reinforcement Learning
☆165Jun 2, 2025Updated last year
henghuiding / Awesome-Multimodal-Referring-Segmentation
View on GitHub
[IJCV 2026] Multimodal Referring Segmentation
☆253Jun 30, 2026Updated 3 weeks ago
mc-lan / Text4Seg
View on GitHub
[ICLR2025] Text4Seg: Reimagining Image Segmentation as Text Generation
☆176Nov 8, 2025Updated 8 months ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
rui-qian / UGround
View on GitHub
Rui Qian, Xin Yin, Chuanhang Deng, et al.: UGround: Towards Unified Visual Grounding with Unrolled Transformers (ICML 2026)
☆29Jun 18, 2026Updated last month
Haochen-Wang409 / TreeVGR
View on GitHub
[ICLR'26] Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology
☆92Jan 26, 2026Updated 5 months ago
jefferyZhan / Griffon
View on GitHub
Official repo of Griffon series including v1(ECCV 2024), v2(ICCV 2025), G, and R, and also the RL tool Vision-R1(CVPR 2026).
☆250Apr 17, 2026Updated 3 months ago
EvolvingLMMs-Lab / open-r1-multimodal
View on GitHub
A fork to add multimodal model training to open-r1
☆1,590Feb 8, 2025Updated last year
IDEA-Research / Rex-Thinker
View on GitHub
[ICLR-2026] Rex-Thinker: Grounded Object Refering via Chain-of-Thought Reasoning
☆150Jun 30, 2025Updated last year
IDEA-Research / ChatRex
View on GitHub
Code for ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
☆216Oct 15, 2025Updated 9 months ago
jdg900 / MMR
View on GitHub
[ICLR 2025] Official Pytorch Implementation of MMR: A Large-scale Benchmark Dataset for Multi-target and Multi-granularity Reasoning Segm…
☆28Apr 3, 2025Updated last year