yan9qu / IntCLIP
Repo for "Synergy of Sight and Semantics: Visual Intention Understanding with CLIP"
☆12Updated 2 months ago
Alternatives and similar repositories for IntCLIP
Users that are interested in IntCLIP are comparing it to the libraries listed below
- [CVPR 2024 Accepted] TaskWeave: Decoupling and Inter-Task Feedback for Joint Moment Retrieval and Highlight Detection☆25Updated 8 months ago
- [ECCV 2024 oral] C2C: Component-to-Composition Learning for Zero-Shot Compositional Action Recognition☆33Updated 6 months ago
- ☆63Updated 2 months ago
- [CVPR 2025] PyTorch implementation of T-CORE, introduced in "When the Future Becomes the Past: Taming Temporal Correspondence for Self-su…☆11Updated 2 months ago
- Composed Person Retrieval (CPR) is a new cross-modal retrieval task that aims to identify individuals in large-scale person image databas…☆24Updated last week
- Official Code for Composite Sketch+Text Queries for Retrieving Objects with Elusive Names and Complex Interactions☆15Updated last year
- Official pytorch repository for "TR-DETR: Task-Reciprocal Transformer for Joint Moment Retrieval and Highlight Detection" (AAAI 2024 Pape…☆48Updated 3 months ago
- Official repository for "Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting" [CVPR 2023]☆118Updated last year
- CVPR2024: Dual Memory Networks: A Versatile Adaptation Approach for Vision-Language Models☆73Updated 11 months ago
- ☆16Updated last year
- [CVPR 2024] Do you remember? Dense Video Captioning with Cross-Modal Memory Retrieval☆57Updated 11 months ago
- [AAAI 2023] Official repository of "Progressive Few-Shot Adaptation of Generative Model with Align-Free Spatial Correlation"☆10Updated last year
- The official repository for ICLR2024 paper "FROSTER: Frozen CLIP is a Strong Teacher for Open-Vocabulary Action Recognition"☆79Updated 4 months ago
- Pytorch Code for "Unified Coarse-to-Fine Alignment for Video-Text Retrieval" (ICCV 2023)☆65Updated last year
- [CVPR 2024] LAKE-RED: Camouflaged Images Generation by Latent Background Knowledge Retrieval-Augmented Diffusion.☆46Updated 4 months ago
- The official pytorch implementation of our CVPR-2024 paper "MMA: Multi-Modal Adapter for Vision-Language Models".☆69Updated last month
- CLIP-Driven Fine-grained Text-Image Person Re-identification☆49Updated last year
- Context-I2W: Mapping Images to Context-dependent words for Accurate Zero-Shot Composed Image Retrieval [AAAI 2024 Oral]☆52Updated last week
- [NeurIPS 2023] The official implementation of SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation☆31Updated last year
- Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models (AAAI 2024)☆74Updated 4 months ago
- Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval [CVPR 2025 Highlight]☆40Updated 2 months ago
- Improving Mamba performance on Video Understanding tasks☆40Updated 7 months ago
- Code for our IJCV 2023 paper "CLIP-guided Prototype Modulating for Few-shot Action Recognition".☆64Updated last year
- CVPR 2023 Accepted Paper HOICLIP: Efficient Knowledge Transfer for HOI Detection with Vision-Language Models☆65Updated last year
- [CVPR 2024] Context-Guided Spatio-Temporal Video Grounding☆54Updated 11 months ago
- ☆39Updated last year
- Code implementation of paper "MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval (AAAI2025)"☆21Updated 4 months ago
- ☆13Updated 3 months ago
- ☆25Updated 9 months ago
- [ICCV'2023 Oral] Implicit Temporal Modeling with Learnable Alignment for Video Recognition☆35Updated last year