frank-chris / ImageTextRetrieval
In this work, we implement different cross-modal learning schemes such as Siamese Network, Correlational Network and Deep Cross-Modal Projection Learning model and study their performance. We also propose a modified Deep Cross-Modal Projection Learning model that uses a different image feature extractor. We evaluate the model’s performance on im…
☆11 Updated 3 years ago
Alternatives and similar repositories for ImageTextRetrieval
Users who are interested in ImageTextRetrieval are comparing it to the libraries listed below.
- ☆22 Updated 3 years ago
- Implementation of our paper, 'Unifying Two-Stream Encoders with Transformers for Cross-Modal Retrieval.' ☆25 Updated last year
- Repository of paper Consistency-preserving Visual Question Answering in Medical Imaging (MICCAI2022) ☆23 Updated 2 years ago
- [TIP2023] The code of “Plug-and-Play Regulators for Image-Text Matching” ☆33 Updated last year
- Paper reading notes in the field of Image-Text Matching/Retrieval. ☆13 Updated 3 years ago
- ☆27 Updated 2 years ago
- ☆19 Updated 2 years ago
- ☆16 Updated 3 years ago
- [ICCVW2023] Robust Asymmetric Loss for Multi-Label Long-Tailed Learning ☆18 Updated last year
- [ACL 2021] Learning Relation Alignment for Calibrated Cross-modal Retrieval ☆31 Updated 2 years ago
- ☆48 Updated last year
- Using image captions with LLM for zero-shot VQA ☆18 Updated last year
- PyTorch code of our KG-SP method for Compositional Zero-Shot Learning ☆12 Updated last year
- Official Implementation of "Geometric Multimodal Contrastive Representation Learning" (https://arxiv.org/abs/2202.03390) ☆28 Updated 6 months ago
- SimVLM: Simple Visual Language Model Pretraining with Weak Supervision ☆36 Updated 2 years ago
- ☆18 Updated 3 years ago
- [ACLW'24] LMPT: Prompt Tuning with Class-Specific Embedding Loss for Long-tailed Multi-Label Visual Recognition ☆52 Updated 11 months ago
- ☆34 Updated 3 years ago
- ☆17 Updated 3 years ago
- Multi-label Image Recognition with Partial Labels (IJCV'24, ESWA'24, AAAI'22) ☆40 Updated last year
- Bounding and Filling: A Fast and Flexible Framework for Image Captioning ☆9 Updated last year
- Official repository for ACCV 2020 paper 'Class-Wise Difficulty-Balanced Loss for Solving Class-Imbalance' ☆18 Updated 4 years ago
- The source code of "Teacher-Student Learning: Efficient Hierarchical Message Aggregation Hashing for Cross-Modal Retrieval." (Accepted by… ☆19 Updated 3 years ago
- CrossCLR: Cross-modal Contrastive Learning For Multi-modal Video Representations, ICCV 2021 ☆64 Updated 3 years ago
- Official implementation of "Calibrating Multimodal Learning" at ICML 2023 ☆20 Updated 2 years ago
- Summary of Related Research on Image-Text Matching ☆69 Updated 2 years ago
- Energy-based Out-of-distribution Detection ☆16 Updated 4 years ago
- ☆62 Updated 4 years ago
- A curated list of vision-and-language pre-training (VLP). :-) ☆59 Updated 3 years ago
- Source code for the paper "A Medical Semantic-Assisted Transformer for Radiographic Report Generation" ☆23 Updated 2 years ago