aim-uofa/SegAgent

[CVPR2025] SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories

☆106

Alternatives and similar repositories for SegAgent

Users who are interested in SegAgent are comparing it to the repositories listed below.

geshang777 / Seg-R1
[NeurIPS-W 2025] Official Implementation of "Seg-R1: Segmentation Can Be Surprisingly Simple with Reinforcement Learning"
☆72 · Jul 1, 2025 · Updated last year
AHideoKuzeA / Evol-SAM3
☆47 · Jan 1, 2026 · Updated 6 months ago
JIA-Lab-research / Seg-Zero
Project Page For "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement"
☆635 · Jan 17, 2026 · Updated 6 months ago
aim-uofa / VLModel
Repo of HawkLlama.
☆16 · Jan 2, 2025 · Updated last year
aim-uofa / DiverGen
DiverGen (CVPR 2024) & BSGAL (ICML 2024)
☆53 · Jul 6, 2025 · Updated last year
HKUST-LongGroup / STAMP
[CVPR 2026] STAMP: Better, Stronger, Faster: Tackling the Trilemma in MLLM-based Segmentation with Simultaneous Textual Mask Prediction
☆39 · Feb 21, 2026 · Updated 5 months ago
aim-uofa / DiffewS
[NeurIPS'24] Unleashing the Potential of the Diffusion Model in Few-shot Semantic Segmentation (Diffews)
☆51 · Apr 14, 2025 · Updated last year
aim-uofa / COSINE
[ICCV'25] Unified Open-World Segmentation with Multi-Modal Prompts
☆16 · Jun 16, 2026 · Updated last month
hustvl / LENS
[AAAI 2026 Oral] LENS: Learning to Segment Anything with Unified Reinforced Reasoning
☆136 · Dec 3, 2025 · Updated 7 months ago
aim-uofa / SINE
[NeurIPS'24] A Simple Image Segmentation Framework via In-Context Examples
☆68 · Oct 29, 2024 · Updated last year
JIA-Lab-research / VisionReasoner
[ICLR 2026] VisionReasoner: Unified Reasoning-Integrated Visual Perception via Reinforcement Learning
☆348 · Feb 9, 2026 · Updated 5 months ago
aim-uofa / GSI-Bench
[CVPR2026] Exploring Spatial Intelligence from a Generative Perspective
☆30 · Jun 3, 2026 · Updated last month
aim-uofa / Active-o3
[ICML2026] ACTIVE-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO
☆83 · Apr 30, 2026 · Updated 2 months ago
songw-zju / PixelThink
The official implementation of "PixelThink: Towards Efficient Chain-of-Pixel Reasoning" (ICML 2026)
☆43 · Jul 4, 2026 · Updated 2 weeks ago
aim-uofa / OmniJigsaw
☆34 · Apr 10, 2026 · Updated 3 months ago
aim-uofa / STAIR
☆18 · Jun 13, 2026 · Updated last month
yayafengzi / LMM-HiMTok
HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model
☆97 · Jul 17, 2025 · Updated last year
aim-uofa / dLLM-MidTruth
[ICLR'26] Official PyTorch implementation of "Time Is a Feature: Exploiting Temporal Dynamics in Diffusion Language Models".
☆66 · Mar 5, 2026 · Updated 4 months ago
mc-lan / Awesome-MLLM-Segmentation
A curated list of publications on image and video segmentation leveraging Multimodal Large Language Models (MLLMs), highlighting state-of…
☆229 · Jun 28, 2026 · Updated 3 weeks ago
cilinyan / ReVOS-api
[ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model
☆22 · Jul 20, 2024 · Updated 2 years ago
mc-lan / Text4Seg
[ICLR2025] Text4Seg: Reimagining Image Segmentation as Text Generation
☆176 · Nov 8, 2025 · Updated 8 months ago
chenxi52 / CMPF
[IJCV 2026] Official implementation of the paper “CMPF: Harmonizing Cross-Model Prior Fusion for Open-Vocabulary Segmentation”
☆26 · Jun 15, 2025 · Updated last year
ysj9909 / StAR
[ECCV 2026] StAR: Segment Anything Reasoner
☆25 · Apr 2, 2026 · Updated 3 months ago
Hectormxy / OP-SAM
The official implementation of ICCV 25 OP-SAM "One Polyp Identifies All: One-Shot Polyp Segmentation with SAM via Cascaded Priors and Ite…
☆15 · Jul 9, 2025 · Updated last year
jcwang0602 / MLLMSeg
MLLMSeg: Unlocking the Potential of MLLMs in Referring Expression Segmentation via a Light-weight Mask Decoder
☆56 · Jun 12, 2026 · Updated last month
aim-uofa / FADiff
[ICML 2024] Floating Anchor Diffusion Model for Multi-motif Scaffolding
☆34 · Aug 23, 2024 · Updated last year
linyq2117 / SAMRefiner
[ICLR 2025] SAMRefiner: Taming Segment Anything Model for Universal Mask Refinement
☆99 · Apr 19, 2025 · Updated last year
Hansxsourse / VRMDiff
☆11 · Mar 11, 2025 · Updated last year
likaiucas / DragOSM
TPAMI under-review paper: DragOSM
☆19 · Feb 26, 2026 · Updated 4 months ago
FudanCVL / OmniAVS
[ICCV 2025] Towards Omnimodal Expressions and Reasoning in Referring Audio-Visual Segmentation
☆91 · Sep 29, 2025 · Updated 9 months ago
aim-uofa / Omni-R1
[NeurIPS 2025] Official Repo of Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration
☆126 · Dec 3, 2025 · Updated 7 months ago
aim-uofa / GenDeF
☆39 · Mar 5, 2026 · Updated 4 months ago
aim-uofa / BA-DDG
[ICLR 2025 Spotlight] Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions
☆45 · Mar 10, 2025 · Updated last year
MaverickRen / PixelLM
[CVPR 2024] PixelLM is an effective and efficient LMM for pixel-level reasoning and understanding.
☆273 · Feb 11, 2025 · Updated last year
berkeley-hipie / segllm
Code release for "SegLLM: Multi-round Reasoning Segmentation"
☆129 · Feb 20, 2025 · Updated last year
aim-uofa / Diception
[NeurIPS 2025 Spotlight] A Generalist Diffusion Model for Vision Perception
☆318 · Sep 21, 2025 · Updated 9 months ago
rui-qian / UGround
Rui Qian, Xin Yin, Chuanhang Deng, et al.: UGround: Towards Unified Visual Grounding with Unrolled Transformers (ICML 2026)
☆29 · Jun 18, 2026 · Updated last month
henghuiding / Awesome-Multimodal-Referring-Segmentation
[IJCV 2026] Multimodal Referring Segmentation
☆253 · Jun 30, 2026 · Updated 3 weeks ago
PolyU-ChenLab / UniPixel
🔮 UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning (NeurIPS 2025)
☆247 · Jan 4, 2026 · Updated 6 months ago