liuting20 / MaPPER
[EMNLP 2024 Main] MaPPER: Multimodal Prior-guided Parameter Efficient Tuning for Referring Expression Comprehension
☆14Updated 7 months ago
Alternatives and similar repositories for MaPPER
Users that are interested in MaPPER are comparing it to the libraries listed below
- The official repo for "Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes", ECCV 2024☆45Updated 8 months ago
- Transactions on Multimedia (TMM25)☆15Updated 4 months ago
- [ICME 2024 Oral] DARA: Domain- and Relation-aware Adapters Make Parameter-efficient Tuning for Visual Grounding☆21Updated 5 months ago
- [AAAI-2025] The official code of Densely Connected Parameter-Efficient Tuning for Referring Image Segmentation☆43Updated 2 months ago
- [ICCV-2023] The official code of Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation☆134Updated last month
- Official Implementation of "Open-Vocabulary Audio-Visual Semantic Segmentation" [ACM MM 2024 Oral].☆31Updated 9 months ago
- [NeurIPS 2023] The official implementation of SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation☆32Updated last year
- The official repo for "Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation", ECCV 2024☆16Updated 10 months ago
- The official code for "TextRefiner: Internal Visual Feature as Efficient Refiner for Vision-Language Models Prompt Tuning" | [AAAI2025]☆42Updated 5 months ago
- This repo is the official pytorch implementation of the paper: CLIPer: Hierarchically Improving Spatial Representation of CLIP for Open-V…☆34Updated 7 months ago
- This repo holds the official code and data for "Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentati…☆70Updated last year
- Official implementation of ICML2024 Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation☆51Updated 11 months ago
- [ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model☆18Updated last year
- [ECCV2024] The official implementation of the DiffPNG paper in PyTorch.☆12Updated 9 months ago
- [ICCV25 Oral] Token Activation Map to Visually Explain Multimodal LLMs☆52Updated this week
- Self-Calibrated CLIP for Training-Free Open-Vocabulary Segmentation☆52Updated 2 months ago
- [CVPR 2025 🔥] A Large Multimodal Model for Pixel-Level Visual Grounding in Videos☆76Updated 3 months ago
- [ICCV 2025] The official PyTorch implementation of "LLaVA-SP: Enhancing Visual Representation with Visual Spatial Tokens for MLLMs".☆15Updated last month
- [ICCV 2025] Official implementation of "InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models"☆42Updated 6 months ago
- A list of referring video object segmentation papers☆46Updated 2 months ago
- [CVPR 2025 Highlight] Your Large Vision-Language Model Only Needs A Few Attention Heads For Visual Grounding☆23Updated last month
- [CVPR 2025] RAP: Retrieval-Augmented Personalization☆64Updated 2 weeks ago
- Official implementation of ResCLIP: Residual Attention for Training-free Dense Vision-language Inference☆43Updated 4 months ago
- The official implementation of our paper ''IteRPrimE: Zero-shot Referring Image Segmentation with Iterative Grad-CAM Refinement and Prima…☆14Updated 4 months ago
- [ICCV 2025] Official PyTorch Code for "Advancing Textual Prompt Learning with Anchored Attributes"☆86Updated 3 weeks ago
- [CVPR 2025] Adaptive Keyframe Sampling for Long Video Understanding