LHL3341 / ContextBLIP
☆11 Updated 10 months ago
Alternatives and similar repositories for ContextBLIP:
Users who are interested in ContextBLIP are comparing it to the libraries listed below:
- ☆10 Updated 3 months ago
- ☆20 Updated 11 months ago
- ☆36 Updated 2 years ago
- [CVPR 2025] Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention ☆28 Updated 8 months ago
- Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models (AAAI 2024) ☆68 Updated 2 months ago
- 【ICLR 2024, Spotlight】Sentence-level Prompts Benefit Composed Image Retrieval ☆80 Updated 11 months ago
- (CVPR2024) MeaCap: Memory-Augmented Zero-shot Image Captioning ☆45 Updated 7 months ago
- [CVPRW-25 MMFM] Official repository of paper titled "How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite fo… ☆47 Updated 7 months ago
- [CVPR 2024] Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension ☆49 Updated last year
- Official pytorch implementation of "RITUAL: Random Image Transformations as a Universal Anti-hallucination Lever in Large Vision Language… ☆10 Updated 3 months ago
- [ICLR 2025] See What You Are Told: Visual Attention Sink in Large Multimodal Models ☆18 Updated last month
- Source code of our AAAI 2024 paper "Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval" ☆36 Updated last year
- Composed Video Retrieval ☆53 Updated 11 months ago
- 🔎Official code for our paper: "VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation". ☆31 Updated 3 weeks ago
- [CVPR2024 Highlight] Official implementation for Transferable Visual Prompting. The paper "Exploring the Transferability of Visual Prompt… ☆38 Updated 3 months ago
- [NeurIPS 2023] Align Your Prompts: Test-Time Prompting with Distribution Alignment for Zero-Shot Generalization ☆104 Updated last year
- [ICML 2024] "Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models" ☆50 Updated 7 months ago
- ☆23 Updated 2 years ago
- Code implementation of paper "MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval (AAAI2025)" ☆18 Updated 2 months ago
- The PyTorch implementation for "DEAL: Disentangle and Localize Concept-level Explanations for VLMs" (ECCV 2024 Strong Double Blind) ☆19 Updated 5 months ago
- [CVPR 2025] RAP: Retrieval-Augmented Personalization ☆40 Updated 2 weeks ago
- [ICML2024] Repo for the paper "Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models" ☆20 Updated 3 months ago
- [CVPR 2024] Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Fine-grained Understanding ☆46 Updated this week
- Official implementation of CVPR 2024 paper "Prompt Learning via Meta-Regularization". ☆27 Updated last month
- Context-I2W: Mapping Images to Context-dependent words for Accurate Zero-Shot Composed Image Retrieval [AAAI 2024 Oral] ☆50 Updated 4 months ago
- [SIGIR 2024] - Simple but Effective Raw-Data Level Multimodal Fusion for Composed Image Retrieval ☆33 Updated 9 months ago
- NegCLIP. ☆31 Updated 2 years ago
- A comprehensive survey of Composed Multi-modal Retrieval (CMR), including Composed Image Retrieval (CIR) and Composed Video Retrieval (CV… ☆24 Updated last month
- Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning ☆20 Updated 7 months ago
- Official PyTorch code of GroundVQA (CVPR'24) ☆59 Updated 7 months ago