xuyang-liu16 / VGDiffZero
[ICASSP 2024] VGDiffZero: Text-to-image Diffusion Models Can Be Zero-shot Visual Grounders
☆15 Updated last month
Alternatives and similar repositories for VGDiffZero:
Users who are interested in VGDiffZero are comparing it to the repositories listed below:
- ☆25 Updated 9 months ago
- OVMR: Open-Vocabulary Recognition with Multi-Modal References (CVPR24) ☆25 Updated 4 months ago
- The official implementation of "Adapter is All You Need for Tuning Visual Tasks". ☆97 Updated 3 weeks ago
- Official implementation of paper "SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference". ☆83 Updated 3 weeks ago
- [ICME 2024 Oral] DARA: Domain- and Relation-aware Adapters Make Parameter-efficient Tuning for Visual Grounding ☆20 Updated last month
- Official implementation for "Diffusion Model is Secretly a Training-free Open Vocabulary Semantic Segmenter" ☆35 Updated last year
- The repository contains the official implementation of "Self-Calibrated CLIP for Training-Free Open-Vocabulary Segmentation" ☆39 Updated 3 weeks ago
- [AAAI-2025] The official code for SiTo (Similarity-based Token Pruning for Stable Diffusion Models) ☆22 Updated 2 months ago
- [NeurIPS2024] ☆19 Updated 3 months ago
- ☆53 Updated last week
- [ECCV2024] ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference ☆77 Updated this week
- [NeurIPS 2024] Visual Perception by Large Language Model’s Weights ☆41 Updated 5 months ago
- ☆18 Updated 2 months ago
- Official PyTorch implementation of GeoDiffusion in ICLR 2024 (https://arxiv.org/abs/2306.04607) ☆77 Updated 2 months ago
- Official implementation of ICML2024 Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation ☆48 Updated 7 months ago
- MADAv2: Advanced Multi-Anchor Based Active Domain Adaptation Segmentation ☆25 Updated last year
- Text-Image Alignment for Diffusion-based Perception (TADP) - CVPR 2024 ☆31 Updated 7 months ago
- [CVPR 2024] The repository contains the official implementation of "Open-Vocabulary Segmentation with Semantic-Assisted Calibration" ☆69 Updated 6 months ago
- [ICCV 2023] Generative Prompt Model for Weakly Supervised Object Localization ☆57 Updated last year
- [NIPS24] Official Implementation of Unsupervised Modality Adaptation with Text-to-Image Diffusion Models for Semantic Segmentation ☆18 Updated 4 months ago
- [CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory" ☆67 Updated 5 months ago
- [CVPR 2025] PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models ☆34 Updated last month
- [ICLR 2025] Reconstructive Visual Instruction Tuning ☆73 Updated 3 weeks ago
- Official repository of the paper "High-Quality Mask Tuning Matters for Open-Vocabulary Segmentation" ☆21 Updated this week
- [ECCV2024] The official implementation of the DiffPNG paper in PyTorch. ☆11 Updated 5 months ago
- p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay ☆32 Updated 2 months ago
- WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation ☆53 Updated last week
- MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer ☆41 Updated 6 months ago
- Multi-Stage Vision Token Dropping: Towards Efficient Multimodal Large Language Model ☆21 Updated 2 months ago
- cliptrase ☆34 Updated 6 months ago