leolee99 / PAU

The official implementation of the paper "Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval", accepted at NeurIPS 2023.

☆27

Alternatives and similar repositories for PAU

Users who are interested in PAU are comparing it to the repositories listed below.

tmlr-group / WCA
[ICML 2024] "Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models"
☆57 Updated last year
ThomasWangY / 2024-AAAI-HPT
Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models (AAAI 2024)
☆73 Updated 9 months ago
xu5zhao / BiCro
☆27 Updated 2 years ago
chunmeifeng / SPRC
【ICLR 2024, Spotlight】Sentence-level Prompts Benefit Composed Image Retrieval
☆90 Updated last year
IIGROUP / MAP
☆37 Updated 3 years ago
QinYang79 / DECL
Deep Evidential Learning with Noisy Correspondence for Cross-modal Retrieval (ACM Multimedia 2022, PyTorch code)
☆47 Updated last year
XLearning-SCU / 2021-NeurIPS-NCR
☆77 Updated 2 years ago
ZhangXu0963 / NPC
The code for the paper "Negative Pre-aware for Noisy Cross-modal Matching" (AAAI 2024).
☆23 Updated 4 months ago
Pter61 / context-i2w
Context-I2W: Mapping Images to Context-dependent words for Accurate Zero-Shot Composed Image Retrieval [AAAI 2024 Oral]
☆55 Updated 5 months ago
huangmozhi9527 / GMMFormer
[AAAI 2024] GMMFormer: Gaussian-Mixture-Model Based Transformer for Efficient Partially Relevant Video Retrieval
☆20 Updated last year
zhangxi1997 / VQACL
VQACL: A Novel Visual Question Answering Continual Learning Setting (CVPR'23)
☆41 Updated last year
joeyz0z / MeaCap
(CVPR2024) MeaCap: Memory-Augmented Zero-shot Image Captioning
☆53 Updated last year
ailab-kyunghee / CM2_DVC
[CVPR 2024] Do you remember? Dense Video Captioning with Cross-Modal Memory Retrieval
☆63 Updated last year
GingL / CMPA
☆16 Updated 2 years ago
kkzhang95 / Awesome-Composed-Multi-modal-Retrieval
A comprehensive survey of Composed Multi-modal Retrieval (CMR), including Composed Image Retrieval (CIR) and Composed Video Retrieval (CV…
☆68 Updated 3 months ago
haokunwen / DQU-CIR
[SIGIR 2024] - Simple but Effective Raw-Data Level Multimodal Fusion for Composed Image Retrieval
☆43 Updated last year
KevinLight831 / CTP
[ICCV2023] - CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation
☆38 Updated last year
CHENGY12 / PLOT
[ICLR2023] PLOT: Prompt Learning with Optimal Transport for Vision-Language Models
☆171 Updated last year
taewhankim / VIPCAP
☆14 Updated 10 months ago
TalalWasim / Vita-CLIP
Official repository for "Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting" [CVPR 2023]
☆127 Updated 2 years ago
MengyuanChen21 / NeurIPS2024-CSP
[NeurIPS 2024] Conjugated Semantic Pool Improves OOD Detection with Pre-trained Vision-Language Models
☆39 Updated last year
kaipengfang / ProS
☆20 Updated last year
ivonajdenkoska / multimodal-meta-learn
[ICLR 2023] Official code repository for "Meta Learning to Bridge Vision and Language Models for Multimodal Few-Shot Learning"
☆60 Updated 2 years ago
zhangy0822 / USER
USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval, TIP 2024
☆33 Updated 5 months ago
sunxm2357 / DIME-FM
Implementation of "DIME-FM: DIstilling Multimodal and Efficient Foundation Models"
☆15 Updated 2 years ago
jameelhassan / PromptAlign
[NeurIPS 2023] Align Your Prompts: Test-Time Prompting with Distribution Alignment for Zero-Shot Generalization
☆108 Updated last year
Jiaxuan-Li / EVCap
[CVPR 2024] Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension
☆59 Updated last year
guozix / TaI-DPT
☆94 Updated 2 years ago
bladewaltz1 / PromptSwitch
☆30 Updated 2 years ago
jpthu17 / HBI
[CVPR 2023 Highlight & TPAMI] Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning
☆122 Updated 10 months ago