PolyU-ChenLab/UniPixel

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/PolyU-ChenLab/UniPixel)

PolyU-ChenLab / UniPixel

🔮 UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning (NeurIPS 2025)

☆247

Alternatives and similar repositories for UniPixel

Users that are interested in UniPixel are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

hustvl / LENS
View on GitHub
[AAAI 2026 Oral] LENS: Learning to Segment Anything with Unified Reinforced Reasoning
☆136Dec 3, 2025Updated 7 months ago
qirui-chen / RGA3-release
View on GitHub
[ICCV 2025] Object-centric Video Question Answering with Visual Grounding and Referring
☆24Aug 8, 2025Updated 11 months ago
JIA-Lab-research / VisionReasoner
View on GitHub
[ICLR 2026] VisionReasoner: Unified Reasoning-Integrated Visual Perception via Reinforcement Learning
☆348Feb 9, 2026Updated 5 months ago
Gorilla-Lab-SCUT / PaDT
View on GitHub
[ICLR 2026] Official implementation of "Patch-as-Decodable-Token: Towards Unified Multi-Modal Vision Tasks in MLLMs"
☆161Oct 31, 2025Updated 8 months ago
HYUNJS / DecAF
View on GitHub
[ICLR 2026] Official implementation of "Decomposed Attention Fusion in MLLMs for Training-Free Video Reasoning Segmentation"
☆35Jan 26, 2026Updated 5 months ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
SalesforceAIResearch / strefer
View on GitHub
Strefer: Empowering Video LLMs with Space-Time Referring and Reasoning via Synthetic Instruction Data
☆19Jun 2, 2026Updated last month
JIA-Lab-research / Seg-Zero
View on GitHub
Project Page For "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement"
☆635Jan 17, 2026Updated 6 months ago
IDEA-Research / Rex-Omni
View on GitHub
[CVPR2026] Detect Anything via Next Point Prediction
☆1,507Feb 22, 2026Updated 4 months ago
QuentinFitteRey / VLMSAM
View on GitHub
Qwen-SAM is a reasoning-based segmentation model that integrates Qwen 2.5 VL 7B with the Segment Anything Model (SAM), enabling fine-grai…
☆32Jun 4, 2025Updated last year
rui-qian / UGround
View on GitHub
Rui Qian, Xin Yin, Chuanhang Deng, et al.: UGround: Towards Unified Visual Grounding with Unrolled Transformers (ICML 2026)
☆29Jun 18, 2026Updated last month
wanghao9610 / X-SAM
View on GitHub
[AAAI2026] X-SAM: From Segment Anything to Any Segmentation
☆382Jul 14, 2026Updated last week
bytedance / Sa2VA
View on GitHub
Official Repo For Pixel-LLM Codebase: Sa2VA (Arxiv-25), SAMTok (CVPR-26), VRT, SaSaSa2VA (1-st solution for LSVOS)
☆1,650Jun 19, 2026Updated last month
zamling / PSALM
View on GitHub
[ECCV2024] This is an official implementation for "PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model"
☆269Dec 30, 2024Updated last year
mc-lan / Awesome-MLLM-Segmentation
View on GitHub
A curated list of publications on image and video segmentation leveraging Multimodal Large Language Models (MLLMs), highlighting state-of…
☆229Jun 28, 2026Updated 3 weeks ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
facebookresearch / DepthLM_Official
View on GitHub
[ICLR 2026 Oral (top 1.2%)] Official implementation of DepthLM
☆362Jun 1, 2026Updated last month
DanielSHKao / CoT-RVS
View on GitHub
[ICLR 2026] Official implementation for CoT-RVS
☆23Mar 17, 2026Updated 4 months ago
cilinyan / VISA
View on GitHub
[ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model
☆213Aug 5, 2024Updated last year
appletea233 / LLaVA-ST
View on GitHub
[CVPR 2025] LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding
☆84Jul 4, 2025Updated last year
Hanzy1996 / OpenSeg-R
View on GitHub
OpenSeg-R: Improving Open-Vocabulary Segmentation via Step-by-Step Visual Reasoning
☆29May 24, 2025Updated last year
echo840 / LIRA
View on GitHub
[ICCV 2025] LIRA
☆22Nov 25, 2025Updated 7 months ago
GLUS-video / GLUS
View on GitHub
[CVPR 2025] Official PyTorch Implementation of GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmenta…
☆70Jun 23, 2025Updated last year
jdg900 / MMR
View on GitHub
[ICLR 2025] Official Pytorch Implementation of MMR: A Large-scale Benchmark Dataset for Multi-target and Multi-granularity Reasoning Segm…
☆28Apr 3, 2025Updated last year
hustvl / MaskAdapter
View on GitHub
[CVPR 2025] Official repository of the paper "Mask-Adapter: The Devil is in the Masks for Open-Vocabulary Segmentation"
☆135Oct 23, 2025Updated 8 months ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
earth-insights / awesome-MLLM-for-image-segmentation
View on GitHub
Paper list for LLM/MLLM-based image segmentation
☆48Dec 24, 2025Updated 6 months ago
henghuiding / Awesome-Multimodal-Referring-Segmentation
View on GitHub
[IJCV 2026] Multimodal Referring Segmentation
☆253Jun 30, 2026Updated 3 weeks ago
aim-uofa / SegAgent
View on GitHub
[CVPR2025] SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories
☆106Aug 8, 2025Updated 11 months ago
jbistanbul / universalvtg
View on GitHub
Official Code for the paper "UniversalVTG: A Univeral and Lightweight Foundation Model for Video Temporal Grounding"
☆15Apr 15, 2026Updated 3 months ago
yayafengzi / LMM-HiMTok
View on GitHub
HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model
☆97Jul 17, 2025Updated last year
VoyagerXvoyagerx / InstructSAM
View on GitHub
InstructSAM: A Training-Free Framework for Instruction-Oriented Remote Sensing Object Recognition (NeurIPS 2025)
☆114Feb 28, 2026Updated 4 months ago
Mini-o3 / Mini-o3
View on GitHub
Official Code for "Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search"
☆422Jan 29, 2026Updated 5 months ago
1e12Leon / RemoteSAM
View on GitHub
[ACM MM 25] Official repo of "RemoteSAM: Towards Segment Anything for Earth Observation"
☆243Jan 4, 2026Updated 6 months ago
congvvc / InstructSeg
View on GitHub
[ICCV 2025] Official implementation of "InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models"
☆56Feb 10, 2025Updated last year
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
geshang777 / Seg-R1
View on GitHub
[NeurIPS-W 2025] Official Implementation of "Seg-R1: Segmentation Can Be Surprisingly Simple with Reinforcement Learning"
☆72Jul 1, 2025Updated last year
yayafengzi / ALToLLM
View on GitHub
ALTo: Adaptive-Length Tokenizer for Autoregressive Mask Generation
☆30May 27, 2025Updated last year
yeliudev / VideoMind
View on GitHub
🧠 VideoMind: A Chain-of-LoRA Agent for Temporal-Grounded Video Reasoning (ICLR 2026)
☆346Feb 8, 2026Updated 5 months ago
debby-0527 / SAM3-I
View on GitHub
Official code and resources for SAM3-I.
☆175Apr 14, 2026Updated 3 months ago
chenwei746 / EEVG
View on GitHub
☆23Aug 20, 2024Updated last year
ZhenyuLU-Heliodore / CoPRS
View on GitHub
Project Page for ICLR'26: CoPRS, offering training overview, inference code, and downloadable links.
☆22Mar 17, 2026Updated 4 months ago
VincentLeebang / lvr
View on GitHub
Official codebase for the paper Latent Visual Reasoning
☆169Oct 22, 2025Updated 8 months ago