🔥 [CVPR 2024] Official implementation of "See, Say, and Segment: Teaching LMMs to Overcome False Premises (SESAME)"
★46 · Jun 16, 2024 · Updated last year
Alternatives and similar repositories for sesame
Users who are interested in sesame are comparing it to the repositories listed below.
- Rui Qian, Xin Yin, Chuanhang Deng, et al.: UGround: Towards Unified Visual Grounding with Unrolled Transformers (ICML 2026) ★22 · Apr 30, 2026 · Updated last week
- Rui Qian, Xin Yin, Dejing Dou†: Reasoning to Attend: Try to Understand How <SEG> Token Works (CVPR 2025) ★53 · Feb 4, 2026 · Updated 3 months ago
- 🔥 [NeurIPS 2025] Official implementation of "Generate, but Verify: Reducing Visual Hallucination in Vision-Language Models with Retrospe… ★57 · Jan 22, 2026 · Updated 3 months ago
- Official Repository of VisGym: Diverse, Customizable, Scalable Environments for Multimodal Agents ★110 · Mar 10, 2026 · Updated last month
- ★33 · Sep 27, 2024 · Updated last year
- [CVPR2024] GSVA: Generalized Segmentation via Multimodal Large Language Models ★167 · Sep 12, 2024 · Updated last year
- Emergent Visual Grounding in Large Multimodal Models Without Grounding Supervision ★44 · Oct 19, 2025 · Updated 6 months ago
- [CVPR2025] Code Release of F-LMM: Grounding Frozen Large Multimodal Models ★110 · May 29, 2025 · Updated 11 months ago
- [ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model ★21 · Jul 20, 2024 · Updated last year
- [CVPR 2024] PixelLM is an effective and efficient LMM for pixel-level reasoning and understanding. ★267 · Feb 11, 2025 · Updated last year
- The official repo for "Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation", ECCV 2024 ★18 · Oct 11, 2024 · Updated last year
- This is official code for "Combating Label Distribution Shift for Active Domain Adaptation" accepted in ECCV2022 ★15 · Oct 25, 2022 · Updated 3 years ago
- [ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model ★212 · Aug 5, 2024 · Updated last year
- 🔥 [ICLR 2025] Official Benchmark Toolkits for "Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark" ★41 · Nov 21, 2025 · Updated 5 months ago
- [ICLR2025] Text4Seg: Reimagining Image Segmentation as Text Generation ★171 · Nov 8, 2025 · Updated 6 months ago
- 🔥 [ICLR 2025] Official PyTorch Model "Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark" ★26 · Feb 9, 2025 · Updated last year
- Open-vocabulary Semantic Segmentation ★33 · Feb 16, 2024 · Updated 2 years ago
- Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning ★43 · Mar 2, 2026 · Updated 2 months ago
- T2I-Copilot: A Training-Free Multi-Agent Text-to-Image System for Enhanced Prompt Interpretation and Interactive Generation (ICCV'25) ★51 · Oct 6, 2025 · Updated 7 months ago
- [ECCV 2024] SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation ★52 · Mar 20, 2025 · Updated last year
- Video Reasoning Segmentation ★27 · Nov 29, 2024 · Updated last year
- ★21 · Feb 10, 2025 · Updated last year
- [CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses tha… ★953 · Aug 5, 2025 · Updated 9 months ago
- [CVPR2025] Project for "HyperSeg: Towards Universal Visual Segmentation with Large Language Model". ★180 · Dec 13, 2024 · Updated last year
- Training, optimization and deployment of Object Detection model with dinov2 backbone for efficient inference on NVIDIA Jetson ★13 · Jul 26, 2025 · Updated 9 months ago
- [ICCV 2023] CTVIS: Consistent Training for Online Video Instance Segmentation ★82 · Oct 15, 2023 · Updated 2 years ago
- The code of 'The devil is in the labels: Semantic segmentation from sentences'. ★13 · Nov 13, 2022 · Updated 3 years ago
- [AAAI 2024] The official implementation of the paper "3D-STMN: Dependency-Driven Superpoint-Text Matching Network for End-to-End 3D Refer… ★45 · Dec 20, 2023 · Updated 2 years ago
- Code for the paper "Benchmarking Object Detectors with COCO: A New Path Forward." ★33 · Jul 13, 2024 · Updated last year
- [NeurIPS 2025] HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation ★78 · Sep 19, 2025 · Updated 7 months ago
- ★39 · Mar 5, 2026 · Updated 2 months ago
- [NeurIPS 2024] Official implementation of the paper "Interfacing Foundation Models' Embeddings" ★132 · Aug 21, 2024 · Updated last year
- ★16 · Dec 9, 2023 · Updated 2 years ago
- LLM-Seg: Bridging Image Segmentation and Large Language Model Reasoning ★193 · Apr 16, 2024 · Updated 2 years ago
- ★15 · Apr 6, 2026 · Updated last month
- [ECCV2024] This is an official implementation for "PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model" ★270 · Dec 30, 2024 · Updated last year
- Benchmark for Anomaly Detection in Semantic Segmentation ★12 · Feb 27, 2026 · Updated 2 months ago
- [ICCV 2025] TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation ★40 · Nov 27, 2024 · Updated last year
- [ICLR'26] Official PyTorch implementation of "Time Is a Feature: Exploiting Temporal Dynamics in Diffusion Language Models". ★64 · Mar 5, 2026 · Updated 2 months ago