tuyunbin / Review-of-Change-Captioning
This repository offers a comprehensive overview of existing datasets and methods in the field of change captioning.
☆16 · Updated 2 months ago
Alternatives and similar repositories for Review-of-Change-Captioning
Users who are interested in Review-of-Change-Captioning are comparing it to the repositories listed below.
- [TPAMI 2024] This is the official Pytorch code for our paper "Context Disentangling and Prototype Inheriting for Robust Visual Grounding"… ☆24 · Updated 5 months ago
- [TPAMI 2024] Dynamic MDETR: A Dynamic Multimodal Transformer Decoder for Visual Grounding ☆30 · Updated last year
- [CVPR 2024] TeachCLIP for Text-to-Video Retrieval ☆40 · Updated 5 months ago
- Source code of our AAAI 2024 paper "Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval" ☆48 · Updated last year
- The official repo for "Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes", ECCV 2024 ☆47 · Updated 3 weeks ago
- Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval [CVPR 2025 Highlight] ☆59 · Updated 3 months ago
- [ACM MM 2024] Hierarchical Multimodal Fine-grained Modulation for Visual Grounding. ☆54 · Updated 2 months ago
- This is a summary of research on noisy correspondence. There may be omissions. If anything is missing please get in touch with us. Our em… ☆72 · Updated 2 weeks ago
- [CVPR2025] Code Release of Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception ☆15 · Updated 4 months ago
- 【ICLR 2024, Spotlight】Sentence-level Prompts Benefit Composed Image Retrieval ☆89 · Updated last year
- The official code for "TextRefiner: Internal Visual Feature as Efficient Refiner for Vision-Language Models Prompt Tuning" | [AAAI2025] ☆45 · Updated 7 months ago
- Composed Video Retrieval ☆61 · Updated last year
- (CVPR2024) MeaCap: Memory-Augmented Zero-shot Image Captioning ☆51 · Updated last year
- ☆35 · Updated last year
- [BMVC 2023] Zero-shot Composed Text-Image Retrieval ☆54 · Updated 11 months ago
- A Benchmark and Awesome Collection of Methods for Remote Sensing Image-Text Retrieval (RSITR)| Remote Sensing Cross-Model Retrieval (RSCM… ☆62 · Updated 7 months ago
- This repo is the official implementation of "Retrieval-Augmented Dynamic Prompt Tuning for Incomplete Multimodal Learning" accepted by AA… ☆53 · Updated last month
- This is a summary of research on noisy correspondence. There may be omissions. If anything is missing please get in touch with us. Our em… ☆101 · Updated 2 weeks ago
- The official implementation of "Cross-modal Causal Relation Alignment for Video Question Grounding. (CVPR 2025 Highlight)" ☆37 · Updated 6 months ago
- A comprehensive survey of Composed Multi-modal Retrieval (CMR), including Composed Image Retrieval (CIR) and Composed Video Retrieval (CV… ☆58 · Updated 2 months ago
- Code for paper "LLMs Can Evolve Continually on Modality for X-Modal Reasoning" NeurIPS2024 ☆38 · Updated 10 months ago
- 【AAAI2025】DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification ☆64 · Updated 7 months ago
- [ICCV 2023] This is the Pytorch code for our paper "Self-Supervised Cross-View Representation Reconstruction for Change Captioning". ☆20 · Updated last month
- Pytorch implementation of "Test-time Adaption against Multi-modal Reliability Bias". ☆43 · Updated 10 months ago
- [CVPR 2025] Understanding Fine-tuning CLIP for Open-vocabulary Semantic Segmentation in Hyperbolic Space ☆25 · Updated 3 months ago
- [TMM 2023] Self-paced Curriculum Adapting of CLIP for Visual Grounding. ☆129 · Updated 2 months ago
- The repo for "MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance", ICML 2024 ☆47 · Updated last year
- Pytorch Code for "Unified Coarse-to-Fine Alignment for Video-Text Retrieval" (ICCV 2023) ☆66 · Updated last year
- [CVPR2025] Hybrid-Level Instruction Injection for Video Token Compression in Multi-modal Large Language Models ☆16 · Updated 6 months ago
- [CVPR 2024] Context-Guided Spatio-Temporal Video Grounding ☆61 · Updated last year