yayafengzi/LMM-HiMTok

HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model

☆98

Alternatives and similar repositories for LMM-HiMTok

Users who are interested in LMM-HiMTok are comparing it to the repositories listed below.

yayafengzi / ALToLLM
ALTo: Adaptive-Length Tokenizer for Autoregressive Mask Generation
☆30 · May 27, 2025 · Updated last year
rui-qian / UGround
Rui Qian, Xin Yin, Chuanhang Deng, et al.: UGround: Towards Unified Visual Grounding with Unrolled Transformers (ICML 2026)
☆29 · Jun 18, 2026 · Updated last month
xuliu-cyber / RSUniVLM
☆47 · Apr 16, 2026 · Updated 3 months ago
JIA-Lab-research / Seg-Zero
Project Page For "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement"
☆636 · Jan 17, 2026 · Updated 6 months ago
congvvc / HyperSeg
[CVPR2025] Project for "HyperSeg: Towards Universal Visual Segmentation with Large Language Model".
☆183 · Dec 13, 2024 · Updated last year
earth-insights / awesome-MLLM-for-image-segmentation
Paper list for LLM/MLLM-based image segmentation
☆48 · Dec 24, 2025 · Updated 7 months ago
nnnth / UFO
[NeurIPS2025 Spotlight 🔥 ] Official implementation of 🛸 "UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Langu…
☆281 · Nov 5, 2025 · Updated 8 months ago
ccx1997 / crnn_ctc_pytorch1.0
CRNN_CTC_PyTorch
☆10 · Oct 17, 2019 · Updated 6 years ago
Rapisurazurite / FFDN
Implementation for Enhancing Tampered Text Detection Through Frequency Feature Fusion and Decomposition
☆30 · Feb 26, 2025 · Updated last year
mc-lan / Awesome-MLLM-Segmentation
A curated list of publications on image and video segmentation leveraging Multimodal Large Language Models (MLLMs), highlighting state-of…
☆231 · Jun 28, 2026 · Updated last month
mc-lan / Text4Seg
[ICLR2025] Text4Seg: Reimagining Image Segmentation as Text Generation
☆177 · Nov 8, 2025 · Updated 8 months ago
aim-uofa / SegAgent
[CVPR2025] SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories
☆108 · Aug 8, 2025 · Updated 11 months ago
earth-insights / RS-MTDF
RS-MTDF: Multi-Teacher Distillation and Fusion for Remote Sensing Semi-Supervised Semantic Segmentation
☆22 · Jun 15, 2025 · Updated last year
earth-insights / Advanced-Earth-Observation
Paper List on Earth Observation in the Foundation Model Era
☆31 · Jun 15, 2026 · Updated last month
MiliLab / UniGeoSeg
Official repo for [CVPR 2026] "UniGeoSeg: Towards Unified Open-World Segmentation for Geospatial Scenes"
☆34 · Mar 30, 2026 · Updated 4 months ago
yilmazkorkmaz1 / referring_change_detection
Referring Change Detection in Remote Sensing Imagery
☆15 · Jan 17, 2026 · Updated 6 months ago
berkeley-hipie / segllm
Code release for "SegLLM: Multi-round Reasoning Segmentation"
☆129 · Feb 20, 2025 · Updated last year
alipay / POA
☆22 · Aug 8, 2024 · Updated last year
MaverickRen / PixelLM
[CVPR 2024] PixelLM is an effective and efficient LMM for pixel-level reasoning and understanding.
☆273 · Feb 11, 2025 · Updated last year
lsa1997 / CARIS
Code for "CARIS: Context-Aware Referring Image Segmentation" [ACM MM2023]
☆30 · Nov 28, 2024 · Updated last year
lisat-bair / LISAt_code
☆30 · Sep 2, 2025 · Updated 10 months ago
hustvl / LENS
[AAAI 2026 Oral] LENS: Learning to Segment Anything with Unified Reinforced Reasoning
☆140 · Dec 3, 2025 · Updated 7 months ago
VisionXLab / DVGBench
[ISPRS2026] DVGBench: Implicit-to-Explicit Visual Grounding Benchmark in UAV Imagery with Large Vision-Language Models
☆30 · Mar 24, 2026 · Updated 4 months ago
DanielSHKao / ThinkFirst
Official implementation for "Think Before You Segment: High-Quality Reasoning Segmentation with GPT Chain of Thoughts"
☆22 · Jun 28, 2025 · Updated last year
songw-zju / PixelThink
The official implementation of "PixelThink: Towards Efficient Chain-of-Pixel Reasoning" (ICML 2026)
☆43 · Jul 4, 2026 · Updated 3 weeks ago
rui-qian / READ
Rui Qian, Xin Yin, Dejing Dou†: Reasoning to Attend: Try to Understand How <SEG> Token Works (CVPR 2025)
☆54 · Feb 4, 2026 · Updated 5 months ago
congvvc / InstructSeg
[ICCV 2025] Official implementation of "InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models"
☆56 · Feb 10, 2025 · Updated last year
linyq2117 / SAMRefiner
[ICLR 2025] SAMRefiner: Taming Segment Anything Model for Universal Mask Refinement
☆100 · Apr 19, 2025 · Updated last year
zhouyiks / CoLVA
☆44 · Jul 9, 2025 · Updated last year
NJU-LHRS / ScoreRS
Code and updates for the ScoreRS project.
☆44 · Sep 19, 2025 · Updated 10 months ago
wanghao9610 / X-SAM
[AAAI2026] X-SAM: From Segment Anything to Any Segmentation
☆386 · Jul 14, 2026 · Updated 2 weeks ago
PolyU-ChenLab / UniPixel
🔮 UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning (NeurIPS 2025)
☆247 · Jan 4, 2026 · Updated 6 months ago
XIEFOX / PixDLM
[CVPR2026] PixDLM: A Dual-Path Multimodal Language Model for UAV Reasoning Segmentation
☆27 · Jul 9, 2026 · Updated 2 weeks ago
Geo-R1 / geo-r1
☆16 · Sep 25, 2025 · Updated 10 months ago
DerrickWang005 / Unpair-Seg.pytorch
Uni-OVSeg is a weakly supervised open-vocabulary segmentation framework that leverages unpaired mask-text pairs.
☆54 · Jun 11, 2024 · Updated 2 years ago
LiBingyu01 / MTRefSeg
An Open-Source Benchmark and Baseline for Multi-temporal Referring Segmentation
☆54 · Jun 4, 2026 · Updated last month
jcwang0602 / MLLMSeg
MLLMSeg: Unlocking the Potential of MLLMs in Referring Expression Segmentation via a Light-weight Mask Decoder
☆57 · Jun 12, 2026 · Updated last month
zhenjiemao / aRefCOCO
[NeurIPS 2025] "SaFiRe: Saccade-Fixation Reiteration with Mamba for Referring Image Segmentation" https://arxiv.org/pdf/2510.10160
☆15 · Nov 26, 2025 · Updated 8 months ago
geshang777 / Seg-R1
[NeurIPS-W 2025] Official Implementation of "Seg-R1: Segmentation Can Be Surprisingly Simple with Reinforcement Learning"
☆72 · Jul 1, 2025 · Updated last year