berkeley-hipie/segllm

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/berkeley-hipie/segllm)

berkeley-hipie / segllm

Code release for "SegLLM: Multi-round Reasoning Segmentation"

☆129

Alternatives and similar repositories for segllm

Users that are interested in segllm are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

see-say-segment / sesame
View on GitHub
🔥 [CVPR 2024] Official implementation of "See, Say, and Segment: Teaching LMMs to Overcome False Premises (SESAME)"
☆47Jun 16, 2024Updated 2 years ago
baoxiaoyi / CoReS
View on GitHub
code for the paper "CoReS: Orchestrating the Dance of Reasoning and Segmentation"
☆23Nov 24, 2025Updated 8 months ago
mc-lan / Text4Seg
View on GitHub
[ICLR2025] Text4Seg: Reimagining Image Segmentation as Text Generation
☆177Nov 8, 2025Updated 8 months ago
LeapLabTHU / GSVA
View on GitHub
[CVPR2024] GSVA: Generalized Segmentation via Multimodal Large Language Models
☆166Sep 12, 2024Updated last year
congvvc / InstructSeg
View on GitHub
[ICCV 2025] Official implementation of "InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models"
☆56Feb 10, 2025Updated last year
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
Shengcao-Cao / groundLMM
View on GitHub
Emergent Visual Grounding in Large Multimodal Models Without Grounding Supervision
☆47Oct 19, 2025Updated 9 months ago
JIA-Lab-research / LISA
View on GitHub
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
☆2,667Feb 16, 2025Updated last year
JIA-Lab-research / Seg-Zero
View on GitHub
Project Page For "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement"
☆635Jan 17, 2026Updated 6 months ago
rui-qian / READ
View on GitHub
Rui Qian, Xin Yin, Dejing Dou†: Reasoning to Attend: Try to Understand How <SEG> Token Works (CVPR 2025)
☆54Feb 4, 2026Updated 5 months ago
mc-lan / Awesome-MLLM-Segmentation
View on GitHub
A curated list of publications on image and video segmentation leveraging Multimodal Large Language Models (MLLMs), highlighting state-of…
☆231Jun 28, 2026Updated 3 weeks ago
MaverickRen / PixelLM
View on GitHub
[CVPR 2024] PixelLM is an effective and efficient LMM for pixel-level reasoning and understanding.
☆273Feb 11, 2025Updated last year
jdg900 / MMR
View on GitHub
[ICLR 2025] Official Pytorch Implementation of MMR: A Large-scale Benchmark Dataset for Multi-target and Multi-granularity Reasoning Segm…
☆28Apr 3, 2025Updated last year
wusize / F-LMM
View on GitHub
[CVPR2025] Code Release of F-LMM: Grounding Frozen Large Multimodal Models
☆115May 29, 2025Updated last year
showlab / VideoLISA
View on GitHub
[NeurlPS 2024] One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos
☆148Dec 26, 2024Updated last year
Serverless GPU API endpoints on Runpod - Get Bonus Credits • Ad
Skip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
mbzuai-oryx / groundingLMM
View on GitHub
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses tha…
☆964Aug 5, 2025Updated 11 months ago
rkzheng99 / ViLLa
View on GitHub
Video Reasoning Segmentation
☆26Nov 29, 2024Updated last year
wangjunchi / LLMSeg
View on GitHub
LLM-Seg: Bridging Image Segmentation and Large Language Model Reasoning
☆194Apr 16, 2024Updated 2 years ago
AI-Application-and-Integration-Lab / SAM4MLLM
View on GitHub
[ECCV 2024] SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation
☆51Mar 20, 2025Updated last year
aim-uofa / SegAgent
View on GitHub
[CVPR2025] SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories
☆107Aug 8, 2025Updated 11 months ago
ut-vision / ActionVOS
View on GitHub
[ECCV 2024 Oral] ActionVOS: Actions as Prompts for Video Object Segmentation
☆32Dec 4, 2024Updated last year
yxchng / mask-grounding
View on GitHub
[CVPR2024] Mask Grounding for Referring Image Segmentation
☆29Jul 22, 2024Updated 2 years ago
CASIA-IVA-Lab / MRES
View on GitHub
This repo holds the official code and data for "Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentati…
☆74Jun 3, 2024Updated 2 years ago
chenwei746 / EEVG
View on GitHub
☆23Aug 20, 2024Updated last year
End-to-end encrypted email - Proton Mail • Ad
Special offer: 40% Off Yearly / 80% Off First Month. All Proton services are open source and independently audited for security.
hustvl / EVF-SAM
View on GitHub
Official code of "EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model"
☆505Mar 17, 2025Updated last year
cilinyan / VISA
View on GitHub
[ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model
☆213Aug 5, 2024Updated last year
JIA-Lab-research / VisionReasoner
View on GitHub
[ICLR 2026] VisionReasoner: Unified Reasoning-Integrated Visual Perception via Reinforcement Learning
☆348Feb 9, 2026Updated 5 months ago
yayafengzi / LMM-HiMTok
View on GitHub
HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model
☆97Jul 17, 2025Updated last year
lizhou-cs / mglmm
View on GitHub
☆32Jun 14, 2026Updated last month
yayafengzi / ALToLLM
View on GitHub
ALTo: Adaptive-Length Tokenizer for Autoregressive Mask Generation
☆30May 27, 2025Updated last year
rongfu-dsb / MPG-SAM2
View on GitHub
[ICCV 2025] MPG-SAM 2: Adapting SAM 2 with Mask Priors and Global Context for Referring Video Object Segmentation
☆23Sep 5, 2025Updated 10 months ago
nnnth / UFO
View on GitHub
[NeurIPS2025 Spotlight 🔥 ] Official implementation of 🛸 "UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Langu…
☆281Nov 5, 2025Updated 8 months ago
zamling / PSALM
View on GitHub
[ECCV2024] This is an official implementation for "PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model"
☆269Dec 30, 2024Updated last year
Deploy open-source AI quickly and easily - Special Bonus Offer • Ad
Runpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
berkeley-hipie / HIPIE
View on GitHub
[NeurIPS2023] Code release for "Hierarchical Open-vocabulary Universal Image Segmentation"
☆294Jun 19, 2025Updated last year
xk-huang / segment-caption-anything
View on GitHub
[CVPR'24] The repository provides code for running inference and training for "Segment and Caption Anything" (SCA) , links for downloadin…
☆233Sep 30, 2024Updated last year
dengandong / GroundMoRe
View on GitHub
☆18May 18, 2026Updated 2 months ago
PolyU-ChenLab / UniPixel
View on GitHub
🔮 UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning (NeurIPS 2025)
☆247Jan 4, 2026Updated 6 months ago
ssocean / pybiblion
View on GitHub
Bibliometric. A Python framework designed for the analysis and evaluation of scholarly publications.
☆15Jan 16, 2026Updated 6 months ago
para-lost / ECHO
View on GitHub
Echo: "Constantly Improving Image Models Need Constantly Improving Benchmarks" (ICLR 2026)
☆20Jan 29, 2026Updated 5 months ago
letitiabanana / PnP-OVSS
View on GitHub
[CVPR'24] Code for Emergent Open-Vocabulary Semantic Segmentation from Off-the-shelf Vision-Language Models
☆18Jul 22, 2024Updated 2 years ago