see-say-segment/sesame

🔥 [CVPR 2024] Official implementation of "See, Say, and Segment: Teaching LMMs to Overcome False Premises (SESAME)"

☆47

Alternatives and similar repositories for sesame

Users interested in sesame are comparing it to the repositories listed below.

berkeley-hipie / segllm
Code release for "SegLLM: Multi-round Reasoning Segmentation"
☆129 · Feb 20, 2025 · Updated last year
rui-qian / UGround
Rui Qian, Xin Yin, Chuanhang Deng, et al.: UGround: Towards Unified Visual Grounding with Unrolled Transformers (ICML 2026)
☆29 · Jun 18, 2026 · Updated last month
para-lost / ECHO
Echo: "Constantly Improving Image Models Need Constantly Improving Benchmarks" (ICLR 2026)
☆20 · Jan 29, 2026 · Updated 6 months ago
rui-qian / READ
Rui Qian, Xin Yin, Dejing Dou†: Reasoning to Attend: Try to Understand How <SEG> Token Works (CVPR 2025)
☆54 · Feb 4, 2026 · Updated 5 months ago
lizhou-cs / mglmm
☆32 · Jun 14, 2026 · Updated last month
LeapLabTHU / GSVA
[CVPR2024] GSVA: Generalized Segmentation via Multimodal Large Language Models
☆166 · Sep 12, 2024 · Updated last year
Shengcao-Cao / groundLMM
Emergent Visual Grounding in Large Multimodal Models Without Grounding Supervision
☆47 · Oct 19, 2025 · Updated 9 months ago
wusize / F-LMM
[CVPR2025] Code Release of F-LMM: Grounding Frozen Large Multimodal Models
☆115 · May 29, 2025 · Updated last year
tsunghan-wu / reverse_vlm
🔥 [NeurIPS 2025] Official implementation of "Generate, but Verify: Reducing Visual Hallucination in Vision-Language Models with Retrospe…
☆58 · Jan 22, 2026 · Updated 6 months ago
cilinyan / ReVOS-api
[ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model
☆22 · Jul 20, 2024 · Updated 2 years ago
MaverickRen / PixelLM
[CVPR 2024] PixelLM is an effective and efficient LMM for pixel-level reasoning and understanding.
☆273 · Feb 11, 2025 · Updated last year
sehyun03 / ADA-label-distribution-matching
This is official code for "Combating Label Distribution Shift for Active Domain Adaptation" accepted in ECCV2022
☆15 · Oct 25, 2022 · Updated 3 years ago
GeWu-Lab / Stepping-Stones
The official repo for "Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation", ECCV 2024
☆18 · Oct 11, 2024 · Updated last year
cilinyan / VISA
[ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model
☆214 · Aug 5, 2024 · Updated last year
visual-haystacks / mirage
🔥 [ICLR 2025] Official PyTorch Model "Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark"
☆27 · Feb 9, 2025 · Updated last year
CongHan0808 / DeOP
Open-vocabulary Semantic Segmentation
☆33 · Feb 16, 2024 · Updated 2 years ago
rkzheng99 / ViLLa
Video Reasoning Segmentation
☆26 · Nov 29, 2024 · Updated last year
congvvc / HyperSeg
[CVPR2025] Project for "HyperSeg: Towards Universal Visual Segmentation with Large Language Model".
☆183 · Dec 13, 2024 · Updated last year
model-similarity / lm-similarity
☆21 · Feb 10, 2025 · Updated last year
GeWu-Lab / Ref-AVS
The official repo for "Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes", ECCV 2024
☆50 · Oct 12, 2025 · Updated 9 months ago
MCG-NKU / ExperiCV
Initial code for computer vision experiments
☆11 · Jan 1, 2023 · Updated 3 years ago
dgcnz / edge
Training, optimization and deployment of Object Detection model with dinov2 backbone for efficient inference on NVIDIA Jetson
☆14 · Jul 26, 2025 · Updated last year
mbzuai-oryx / groundingLMM
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses tha…
☆964 · Aug 5, 2025 · Updated 11 months ago
baoxiaoyi / CoReS
code for the paper "CoReS: Orchestrating the Dance of Reasoning and Segmentation"
☆23 · Nov 24, 2025 · Updated 8 months ago
KainingYing / CTVIS
[ICCV 2023] CTVIS: Consistent Training for Online Video Instance Segmentation
☆83 · Oct 15, 2023 · Updated 2 years ago
GeWu-Lab / Generalizable-Audio-Visual-Segmentation
Official repository of "Prompting Segmentation with Sound is Generalizable Audio-Visual Source Localizer", AAAI 2024
☆28 · Mar 14, 2026 · Updated 4 months ago
irfanICMLL / SSIW
The code of 'The devil is in the labels: Semantic segmentation from sentences'.
☆13 · Nov 13, 2022 · Updated 3 years ago
sosppxo / 3D-STMN
[AAAI 2024] The official implementation of the paper "3D-STMN: Dependency-Driven Superpoint-Text Matching Network for End-to-End 3D Refer…
☆45 · Dec 20, 2023 · Updated 2 years ago
Haochen-Wang409 / TreeVGR
[ICLR'26] Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology
☆91 · Jan 26, 2026 · Updated 6 months ago
gongda0e / KARI
Activity Grammars for Temporal Action Segmentation (NeurIPS 2023)
☆14 · Jun 14, 2024 · Updated 2 years ago
Gen-Verse / HermesFlow
[NeurIPS 2025] HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation
☆77 · Sep 19, 2025 · Updated 10 months ago
UX-Decoder / FIND
[NeurIPS 2024] Official implementation of the paper "Interfacing Foundation Models' Embeddings"
☆132 · Aug 21, 2024 · Updated last year
mc-lan / Text4Seg
[ICLR2025] Text4Seg: Reimagining Image Segmentation as Text Generation
☆177 · Nov 8, 2025 · Updated 8 months ago
wangjunchi / LLMSeg
LLM-Seg: Bridging Image Segmentation and Large Language Model Reasoning
☆194 · Apr 16, 2024 · Updated 2 years ago
visual-haystacks / vhs_benchmark
🔥 [ICLR 2025] Official Benchmark Toolkits for "Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark"
☆44 · Nov 21, 2025 · Updated 8 months ago
zamling / PSALM
[ECCV2024] This is an official implementation for "PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model"
☆269 · Dec 30, 2024 · Updated last year
hermannsblum / fishyscapes
Benchmark for Anomaly Detection in Semantic Segmentation
☆12 · Feb 27, 2026 · Updated 5 months ago
yujunwei04 / UnSAMv2
Code release for "UnSAMv2: Self-Supervised Learning Enables Segment Anything at Any Granularity"
☆82 · Feb 1, 2026 · Updated 5 months ago
Hon-Wong / ByteVideoLLM
[ICCV 2025] Dynamic-VLM
☆28 · Dec 16, 2024 · Updated last year