VoyageWang / IteRPrimEView external linksLinks
The official implementation of our paper ''IteRPrimE: Zero-shot Referring Image Segmentation with Iterative Grad-CAM Refinement and Primary Word Emphasis''
☆20Apr 6, 2025Updated 10 months ago
Alternatives and similar repositories for IteRPrimE
Users that are interested in IteRPrimE are comparing it to the libraries listed below
Sorting:
- [CVPR 2025] Hybrid Global-Local Representation with Augmented Spatial Guidance for Zero-Shot Referring Image Segmentation☆31Jun 27, 2025Updated 7 months ago
- [TIP 2025] Self-Calibrated CLIP for Training-Free Open-Vocabulary Segmentation☆58Dec 22, 2025Updated last month
- [CVPR'2022, TPAMI'2024] LAVT: Language-Aware Vision Transformer for Referring Segmentation☆24Jan 21, 2025Updated last year
- This repo is the official pytorch implementation of the paper: CLIPer: Hierarchically Improving Spatial Representation of CLIP for Open-V…☆40Sep 10, 2025Updated 5 months ago
- Official PyTorch implementation of GaussianToken: An Effective Image Tokenizer with 2D Gaussian Splatting☆101Apr 3, 2025Updated 10 months ago
- [NeurIPS'24]Efficient and accurate memory saving method towards W4A4 large multi-modal models.☆97Jan 3, 2025Updated last year
- [CVPR 2025] Fine-Grained Image-Text Correspondence with Cost Aggregation for Open-Vocabulary Part Segmentation☆22Nov 17, 2025Updated 3 months ago
- The official code of "Thinking With Videos: Multimodal Tool-Augmented Reinforcement Learning for Long Video Reasoning"☆81Oct 15, 2025Updated 4 months ago
- [CVPR 2023] Official code for "Zero-shot Referring Image Segmentation with Global-Local Context Features"☆128Mar 17, 2025Updated 11 months ago
- This is an official PyTorch implementation of ASDA (accepted by ACMMM 2024).☆26Oct 22, 2024Updated last year
- The repository contains the official implementation of "DPMesh: Exploiting Diffusion Prior for Occluded Human Mesh Recovery"☆25Jul 25, 2024Updated last year
- Official code for Zero-shot Referring Expression Comprehension via Structural Similarity Between Images and Captions (CVPR 2024)☆28Jun 21, 2024Updated last year
- ☆35Mar 14, 2024Updated last year
- [CVPR 2024] The repository contains the official implementation of "Open-Vocabulary Segmentation with Semantic-Assisted Calibration"☆76Sep 23, 2024Updated last year
- Video as Conditional Graph Hierarchy for Multi-Granular Question Answering (AAAI'22, Oral)☆34Sep 17, 2022Updated 3 years ago
- [CVPR2024] Open-Vocabulary Semantic Segmentation with Image Embedding Balancing☆40Jan 12, 2026Updated last month
- ☆21May 18, 2025Updated 8 months ago
- The repository of VG-Refiner paper☆17Dec 9, 2025Updated 2 months ago
- A vision-language model with bidirectional progressive fusion and global-local alignment for enhanced medical image segmentation.☆17Dec 25, 2025Updated last month
- ☆21Dec 11, 2025Updated 2 months ago
- [ICCV 2025] Code for Momentum-GS: Momentum Gaussian Self-Distillation for High-Quality Large Scene Reconstruction☆169Dec 15, 2025Updated 2 months ago
- The implementation codes of paper: Multimodal Sentiment Analysis with Mutual Information-based Disentangled Representation Learning☆18May 8, 2025Updated 9 months ago
- [ACM MM2024] The code for HMLLM.☆11Oct 27, 2024Updated last year
- SpeedVision is an AI-powered tool that detects and calculates vehicle speed from video footage using YOLO-based object detection and fram…☆10Sep 22, 2024Updated last year
- ☆10Mar 17, 2023Updated 2 years ago
- ECCV24 "ReMamber: Referring Image Segmentation with Mamba Twister" official repository.☆44Jul 11, 2024Updated last year
- ☆13Sep 8, 2024Updated last year
- ☆12Dec 17, 2024Updated last year
- ☆13Jul 8, 2024Updated last year
- ☆12Sep 27, 2023Updated 2 years ago
- Official implementation of the paper "M3CoTBench: Benchmark Chain-of-Thought of MLLMs in Medical Image Understanding"☆20Jan 14, 2026Updated last month
- ☆13Apr 10, 2025Updated 10 months ago
- Official repository for Scone (Subject-driven Composition and Distinction Enhancement) model, designed to support multi-subject compositi…☆28Jan 14, 2026Updated last month
- Interpreting CLIP with Hierarchical Sparse Autoencoders (ICML 2025)☆19Jan 17, 2026Updated last month
- Hypergraph Vision Transformers: Images are More than Nodes, More than Edges☆17Jul 25, 2025Updated 6 months ago
- [ECCV 2024] "REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language Models"☆13Aug 6, 2024Updated last year
- Official codebase for FACMIC: Federated Adaptative CLIP Model for Medical Image Classification (Accepted at MICCAI 2024)☆14Jun 21, 2024Updated last year
- cliptrase☆47Sep 1, 2024Updated last year
- [AAAI-2025] The official code of Densely Connected Parameter-Efficient Tuning for Referring Image Segmentation☆67May 21, 2025Updated 8 months ago