liyih / CSCL
[CVPR 2025] Unleashing the Potential of Consistency Learning for Detecting and Grounding Multi-Modal Media Manipulation
☆22 Updated 4 months ago
Alternatives and similar repositories for CSCL
Users who are interested in CSCL are comparing it to the repositories listed below.
- [CVPR25 Highlight] A ChatGPT-Prompted Visual hallucination Evaluation Dataset, featuring over 100,000 data samples and four advanced eval… ☆27 Updated 7 months ago
- ☆75 Updated 7 months ago
- Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval [CVPR 2025 Highlight] ☆60 Updated 4 months ago
- [ACM MM 2024] FKA-Owl: Advancing Multimodal Fake News Detection through Knowledge-Augmented LVLMs ☆48 Updated last year
- A comprehensive survey of Composed Multi-modal Retrieval (CMR), including Composed Image Retrieval (CIR) and Composed Video Retrieval (CV… ☆68 Updated 3 months ago
- SNIFFER: Multimodal Large Language Model for Explainable Out-of-Context Misinformation Detection ☆76 Updated last year
- [ToMM2023] - AMC: Adaptive Multi-expert Collaborative Network for Text-guided Image Retrieval ☆20 Updated last year
- Codes of the Fine-grained Textual Inversion network for Zero-Shot Composed Image Retrieval ☆26 Updated 7 months ago
- [CVPR 2024] TeachCLIP for Text-to-Video Retrieval ☆40 Updated 6 months ago
- [AAAI 2024] DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval. ☆44 Updated last year
- [AAAI 2024 Oral] M2CLIP: A Multimodal, Multi-Task Adapting Framework for Video Action Recognition ☆72 Updated 11 months ago
- Uncertainty-Guided Noisy Correspondence Learning for Efficient Cross-Modal Matching (ACM SIGIR 2024, Pytorch Code) ☆24 Updated 9 months ago
- [CVPR 2024] Official repository of the paper "Uncovering What, Why and How: A Comprehensive Benchmark for Causation Understanding of Vid… ☆85 Updated 10 months ago
- The official implementation of "Cross-modal Causal Relation Alignment for Video Question Grounding. (CVPR 2025 Highlight)" ☆38 Updated 6 months ago
- [SIGIR 2024] - Simple but Effective Raw-Data Level Multimodal Fusion for Composed Image Retrieval ☆43 Updated last year
- Code for MInD: Multimodal Information Disentanglement ☆17 Updated last year
- ☆20 Updated last year
- [ICML2024] Official PyTorch implementation of CoMC: Language-Driven Cross-Modal Classifier for Zero-Shot Multi-Label Image Recognition ☆16 Updated last year
- [CVPR 2024] Context-Guided Spatio-Temporal Video Grounding ☆62 Updated last year
- Code and Dataset for the paper "LAMM: Label Alignment for Multi-Modal Prompt Learning" AAAI 2024 ☆33 Updated last year
- Code for paper "LLMs Can Evolve Continually on Modality for X-Modal Reasoning" NeurIPS2024 ☆40 Updated 11 months ago
- [CVPR 2025] Official PyTorch Code for "DPC: Dual-Prompt Collaboration for Tuning Vision-Language Models" ☆35 Updated 2 months ago
- Official Implementation for MoPE (T-MM 2025) ☆24 Updated last month
- [TPAMI 2024] This is the official Pytorch code for our paper "Context Disentangling and Prototype Inheriting for Robust Visual Grounding"… ☆26 Updated 6 months ago
- [CVPR 2025] Devils in Middle Layers of Large Vision-Language Models: Interpreting, Detecting and Mitigating Object Hallucinations via Att… ☆52 Updated last month
- [ICCV 2025] Official PyTorch Code for "Advancing Textual Prompt Learning with Anchored Attributes" ☆106 Updated this week
- Code implementation of paper "MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval (AAAI2025)" ☆24 Updated 9 months ago
- [ICML 2024] "Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models" ☆57 Updated last year
- [AAAI'25]: Building a Multi-modal Spatiotemporal Expert for Zero-shot Action Recognition with CLIP ☆17 Updated 3 months ago
- Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight) ☆83 Updated last year