frank-chris / ImageTextRetrieval
In this work, we implement several cross-modal learning schemes, namely the Siamese Network, the Correlational Network, and the Deep Cross-Modal Projection Learning model, and study their performance. We also propose a modified Deep Cross-Modal Projection Learning model that uses a different image feature extractor. We evaluate the model’s performance on im…
☆11Updated 4 years ago
Alternatives and similar repositories for ImageTextRetrieval
Users who are interested in ImageTextRetrieval compare it to the repositories listed below.
- [TIP2023] The code of “Plug-and-Play Regulators for Image-Text Matching”☆33Updated last year
- Using image captions with LLM for zero-shot VQA☆18Updated last year
- ☆16Updated 3 years ago
- ☆28Updated 2 years ago
- ☆22Updated 3 years ago
- ☆63Updated 4 years ago
- Implementation of our paper, 'Unifying Two-Stream Encoders with Transformers for Cross-Modal Retrieval.'☆25Updated last year
- This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have b…☆79Updated 3 months ago
- The source code of "Teacher-Student Learning: Efficient Hierarchical Message Aggregation Hashing for Cross-Modal Retrieval." (Accepted by…☆19Updated 3 years ago
- ☆49Updated 2 years ago
- [ACL 2021] Learning Relation Alignment for Calibrated Cross-modal Retrieval☆31Updated 2 years ago
- ☆46Updated 3 years ago
- ☆18Updated 3 years ago
- CrossCLR: Cross-modal Contrastive Learning For Multi-modal Video Representations, ICCV 2021☆64Updated 3 years ago
- Repository of paper Consistency-preserving Visual Question Answering in Medical Imaging (MICCAI2022)☆23Updated 2 years ago
- Multi-label Image Recognition with Partial Labels (IJCV'24, ESWA'24, AAAI'22)☆40Updated last year
- Paper list of compositional zero-shot learning☆10Updated 3 years ago
- CLIP4IDC: CLIP for Image Difference Captioning (AACL 2022)☆34Updated 2 years ago
- SimVLM: Simple Visual Language Model Pretraining with Weak Supervision☆36Updated 2 years ago
- Official code release for ARTEMIS: Attention-based Retrieval with Text-Explicit Matching and Implicit Similarity (published at ICLR 2022)☆52Updated 2 years ago
- The official code for "Visual Relationship Detection with Visual-Linguistic Knowledge from Multimodal Representations" (IEEE Access, 2021…☆17Updated 2 years ago
- The official implementation of 'Align and Attend: Multimodal Summarization with Dual Contrastive Losses' (CVPR 2023)☆78Updated 2 years ago
- ☆15Updated 2 years ago
- A curated list of vision-and-language pre-training (VLP). :-)☆59Updated 3 years ago
- Hate-CLIPper: Multimodal Hateful Meme Classification with Explicit Cross-modal Interaction of CLIP features - Accepted at EMNLP 2022 Work…☆53Updated 5 months ago
- Source code of our AAAI 2024 paper "Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval"☆47Updated last year
- [Paper][AAAI 2023] DUET: Cross-modal Semantic Grounding for Contrastive Zero-shot Learning☆52Updated last year
- ☆20Updated 2 years ago
- [COLING'25] HGCLIP: Exploring Vision-Language Models with Graph Representations for Hierarchical Understanding☆42Updated 9 months ago
- Implementation of our AAAI2022 paper, Show Your Faith: Cross-Modal Confidence-Aware Network for Image-Text Matching.☆36Updated 2 years ago