letitiabanana / PnP-OVSS
[CVPR'24] Code for Emergent Open-Vocabulary Semantic Segmentation from Off-the-shelf Vision-Language Models
☆10 · Updated last month
Related projects:
- ☆12 · Updated 3 weeks ago
- ☆21 · Updated last year
- ☆32 · Updated last year
- Vision Relation Transformer for Unbiased Scene Graph Generation (ICCV 2023) ☆21 · Updated 11 months ago
- ☆12 · Updated 2 months ago
- Code Release of F-LMM: Grounding Frozen Large Multimodal Models ☆35 · Updated last month
- CVPR2022 - Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation ☆22 · Updated 2 years ago
- [ICML2024] The official implementation of SemiRES in PyTorch. ☆18 · Updated 3 months ago
- Towards a Unified View on Visual Parameter-Efficient Transfer Learning ☆26 · Updated last year
- [WACV 2024] Instruct Me More! Random Prompting for Visual In-Context Learning ☆13 · Updated 5 months ago
- The official repo for "Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation", ECCV 2024 ☆9 · Updated last week
- ☆16 · Updated last year
- ☆10 · Updated last year
- Codes for ICML 2023 Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation ☆35 · Updated last year
- [ICLR 23] Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning ☆36 · Updated last year
- [CVPR 2023] Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners ☆35 · Updated last year
- Official implementation of TagAlign ☆31 · Updated 5 months ago
- [AAAI2024] Code Release of CLIM: Contrastive Language-Image Mosaic for Region Representation ☆25 · Updated 7 months ago
- This is the official code of "Uncovering Prototypical Knowledge for Weakly Open-Vocabulary Semantic Segmentation, NeurIPS 23" ☆22 · Updated 9 months ago
- ☆12 · Updated 9 months ago
- ☆32 · Updated 10 months ago
- ECCV24 "ReMamber: Referring Image Segmentation with Mamba Twister" official repository. ☆10 · Updated 2 months ago
- CVPR2024: Dual Memory Networks: A Versatile Adaptation Approach for Vision-Language Models ☆54 · Updated 2 months ago
- RefTeacher is a strong baseline method for Semi-Supervised Referring Expression Comprehension. ☆12 · Updated last year
- Lightweight Transformer for Multi-modal Tasks ☆15 · Updated last year
- [CVPR-2023] The official dataset of Advancing Visual Grounding with Scene Knowledge: Benchmark and Method. ☆28 · Updated last year
- [ECCV2024] ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference ☆37 · Updated last month
- [ICML 2024] Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning ☆41 · Updated 4 months ago
- ☆17 · Updated last year
- Official Codes for Fine-Grained Visual Prompting, NeurIPS 2023 ☆32 · Updated 7 months ago