HKUST-LongGroup/STAMP

[CVPR 2026] STAMP: Better, Stronger, Faster: Tackling the Trilemma in MLLM-based Segmentation with Simultaneous Textual Mask Prediction

☆39

Alternatives and similar repositories for STAMP

Users that are interested in STAMP are comparing it to the libraries listed below.

HKUST-LongGroup / DyME
[ICLR 2026] Empowering Small VLMs to Think with Dynamic Memorization and Exploration
☆18 · Mar 18, 2026 · Updated 4 months ago
rui-qian / UGround
Rui Qian, Xin Yin, Chuanhang Deng, et al.: UGround: Towards Unified Visual Grounding with Unrolled Transformers (ICML 2026)
☆29 · Jun 18, 2026 · Updated last month
rui-qian / READ
Rui Qian, Xin Yin, Dejing Dou†: Reasoning to Attend: Try to Understand How <SEG> Token Works (CVPR 2025)
☆54 · Feb 4, 2026 · Updated 5 months ago
yayafengzi / ALToLLM
ALTo: Adaptive-Length Tokenizer for Autoregressive Mask Generation
☆30 · May 27, 2025 · Updated last year
songw-zju / PixelThink
The official implementation of "PixelThink: Towards Efficient Chain-of-Pixel Reasoning" (ICML 2026)
☆43 · Jul 4, 2026 · Updated 3 weeks ago
mc-lan / Awesome-MLLM-Segmentation
A curated list of publications on image and video segmentation leveraging Multimodal Large Language Models (MLLMs), highlighting state-of…
☆231 · Jun 28, 2026 · Updated 3 weeks ago
aim-uofa / SegAgent
[CVPR2025] SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories
☆107 · Aug 8, 2025 · Updated 11 months ago
ZhenyuLU-Heliodore / CoPRS
Project Page for ICLR'26: CoPRS, offering training overview, inference code, and downloadable links.
☆22 · Mar 17, 2026 · Updated 4 months ago
baoxiaoyi / CoReS
Code for the paper "CoReS: Orchestrating the Dance of Reasoning and Segmentation"
☆23 · Nov 24, 2025 · Updated 8 months ago
AI-Application-and-Integration-Lab / SAM4MLLM
[ECCV 2024] SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation
☆51 · Mar 20, 2025 · Updated last year
TheEighthDay / SeekWorld
The first attempt to replicate o3-like visual clue-tracking reasoning capabilities.
☆64 · Jul 8, 2025 · Updated last year
eVI-group-SCU / Dr-Seg
[CVPR'26] Dr. Seg: Revisiting GRPO Training for Visual Large Language Models through Perception-Oriented Design
☆31 · Mar 7, 2026 · Updated 4 months ago
li-xirong / mmc-amd
Multi-modal categorization of Age-related Macular Degeneration (4 classes: normal, dry AMD, pcv, wet AMD)
☆32 · Jun 22, 2026 · Updated last month
jiazhen-code / PhD
[CVPR25 Highlight] A ChatGPT-Prompted Visual hallucination Evaluation Dataset, featuring over 100,000 data samples and four advanced eval…
☆32 · Apr 16, 2025 · Updated last year
itayle / diverse-demonstrations
Diverse Demonstrations Improve In-context Compositional Generalization
☆13 · Jul 7, 2023 · Updated 3 years ago
AAwcAA / WOW-Seg-Meta
☆35 · Updated this week
xuyanyu-shh / Personalized-Saliency
Beyond Universal Saliency: Personalized Saliency Prediction with Multi-task CNN (IJCAI 2017 and TPAMI)
☆11 · Jan 17, 2019 · Updated 7 years ago
yayafengzi / LMM-HiMTok
HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model
☆97 · Jul 17, 2025 · Updated last year
JIA-Lab-research / Seg-Zero
Project Page For "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement"
☆635 · Jan 17, 2026 · Updated 6 months ago
geshang777 / Seg-R1
[NeurIPS-W 2025] Official Implementation of "Seg-R1: Segmentation Can Be Surprisingly Simple with Reinforcement Learning"
☆72 · Jul 1, 2025 · Updated last year
TArdelean / AnomalyLocalizationFCA
Official implementation of High-Fidelity Zero-Shot Texture Anomaly Localization Using Feature Correspondence Analysis.
☆12 · Dec 18, 2023 · Updated 2 years ago
mc-lan / Text4Seg
[ICLR2025] Text4Seg: Reimagining Image Segmentation as Text Generation
☆177 · Nov 8, 2025 · Updated 8 months ago
nnnth / UFO
[NeurIPS2025 Spotlight 🔥 ] Official implementation of 🛸 "UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Langu…
☆281 · Nov 5, 2025 · Updated 8 months ago
haraldger / DRL-DecisionTransformer
Research project for Deep Reinforcement Learning using Decision Transformer
☆16 · May 12, 2023 · Updated 3 years ago
digtrade / digtrade
Trading Consequences data and code
☆15 · Mar 5, 2015 · Updated 11 years ago
wangjiangshan0725 / COVE
[NeurIPS 2024] COVE: Unleashing the Diffusion Feature Correspondence for Consistent Video Editing
☆26 · Dec 8, 2024 · Updated last year
JIA-Lab-research / VisionReasoner
[ICLR 2026] VisionReasoner: Unified Reasoning-Integrated Visual Perception via Reinforcement Learning
☆348 · Feb 9, 2026 · Updated 5 months ago
Gorilla-Lab-SCUT / PaDT
[ICLR 2026] Official implementation of "Patch-as-Decodable-Token: Towards Unified Multi-Modal Vision Tasks in MLLMs"
☆162 · Oct 31, 2025 · Updated 8 months ago
XueyuLiu / PPO
The official implementation code for Plug-and-Play PPO: An Adaptive Point Prompt Optimizer Making SAM Greater.
☆38 · Jan 28, 2026 · Updated 5 months ago
FYYDCC / IVT-LR
Official repository for “Reasoning in the Dark: Interleaved Vision-Text Reasoning in Latent Space”
☆18 · Jan 27, 2026 · Updated 5 months ago
mercurystraw / Kris_Bench
[NIPS 25'] Evaluation code of paper "KRIS-Bench: Benchmarking Next-Level Intelligent Image Editing Models"
☆46 · Oct 19, 2025 · Updated 9 months ago
ForJadeForest / LIVE-Learnable-In-Context-Vector
【NeurIPS 2024】The implementation of LIVE: Learnable In-Context Vector for Visual Question Answering https://arxiv.org/abs/2406.13185
☆23 · May 31, 2025 · Updated last year
harkerhand / seu_utils
A collection of tools for Southeast University students
☆16 · Sep 10, 2025 · Updated 10 months ago
OpenGVLab / Mono-InternVL
[CVPR 2025] Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training
☆109 · Jul 18, 2025 · Updated last year
linsun449 / cliper.code
This repo is the official pytorch implementation of the paper: CLIPer: Hierarchically Improving Spatial Representation of CLIP for Open-V…
☆42 · Sep 10, 2025 · Updated 10 months ago
see-say-segment / sesame
🔥 [CVPR 2024] Official implementation of "See, Say, and Segment: Teaching LMMs to Overcome False Premises (SESAME)"
☆47 · Jun 16, 2024 · Updated 2 years ago
zyc00 / point-sam-demo
☆11 · Jun 19, 2024 · Updated 2 years ago
SoccerNet / sn-teamspotting
DevKit for SoccerNet Team Action Spotting Challenge 2025
☆19 · Aug 26, 2025 · Updated 10 months ago
kevjshih / wtl_vqa
Released code for the paper: Where To Look: Focus Regions for Visual Question Answering. (CVPR2016)
☆10 · Apr 8, 2020 · Updated 6 years ago