leolee99 / PAU
The official implementation of paper "Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval" accepted by NeurIPS' 2023.
☆24Updated 11 months ago
Alternatives and similar repositories for PAU:
Users that are interested in PAU are comparing it to the libraries listed below
- [ICML 2024] "Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models"☆50Updated 7 months ago
- Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models (AAAI 2024)☆69Updated 2 months ago
- ☆25Updated last year
- Deep Evidential Learning with Noisy Correspondence for Cross-modal Retrieval ( ACM Multimedia 2022, Pytorch Code)☆43Updated last year
- ☆36Updated 2 years ago
- 【ICLR 2024, Spotlight】Sentence-level Prompts Benefit Composed Image Retrieval☆81Updated last year
- Context-I2W: Mapping Images to Context-dependent words for Accurate Zero-Shot Composed Image Retrieval [AAAI 2024 Oral]☆50Updated 5 months ago
- A simple pytorch implementation of baseline based-on CLIP for Image-text Matching.☆13Updated last year
- ☆10Updated last year
- The code of the paper "Negative Pre-aware for Noisy Cross-modal Matching" in AAAI 2024.☆22Updated 11 months ago
- VQACL: A Novel Visual Question Answering Continual Learning Setting (CVPR'23)☆36Updated last year
- ☆12Updated 3 months ago
- [NeurIPS 2024] Code for Dual Prototype Evolving for Test-Time Generalization of Vision-Language Models☆38Updated last month
- [CVPR 2023] Collecting Cross-Modal Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception☆35Updated last year
- ☆33Updated last year
- [AAAI 2024] GMMFormer: Gaussian-Mixture-Model Based Transformer for Efficient Partially Relevant Video Retrieval☆16Updated 11 months ago
- [NeurIPS 2024] Conjugated Semantic Pool Improves OOD Detection with Pre-trained Vision-Language Models☆35Updated 6 months ago
- ☆73Updated last year
- (ICML 2024) Improve Context Understanding in Multimodal Large Language Models via Multimodal Composition Learning☆28Updated 6 months ago
- Code for Label Propagation for Zero-shot Classification with Vision-Language Models (CVPR2024)☆36Updated 9 months ago
- The code of "Image-text Retrieval via Preserving Main Semantic of Vision" in ICME 2023.☆14Updated last year
- ☆19Updated 9 months ago
- Pytorch implementation for "Erasing the Bias: Fine-Tuning Foundation Models for Semi-Supervised Learning" (ICML 2024)☆17Updated 3 months ago
- Official pytorch implementation of CVPR2023 paper "Learning Conditional Attributes for Compositional Zero-Shot Learning"☆16Updated 7 months ago
- USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval, TIP 2024☆31Updated last year
- Cross-modal Active Complementary Learning with Self-refining Correspondence (NeurIPS 2023, Pytorch Code)☆16Updated 10 months ago
- A comprehensive survey of Composed Multi-modal Retrieval (CMR), including Composed Image Retrieval (CIR) and Composed Video Retrieval (CV…☆28Updated last month
- Implementation of "DIME-FM: DIstilling Multimodal and Efficient Foundation Models"☆14Updated last year
- [TIP2023] The code of “Plug-and-Play Regulators for Image-Text Matching”☆33Updated last year
- (CVPR2024) MeaCap: Memory-Augmented Zero-shot Image Captioning☆46Updated 8 months ago