Paranioar / Awesome_Matching_Pretraining_Transfering

A paper list covering Large Multi-Modality Models (Perception, Generation, Unification), Parameter-Efficient Finetuning, Vision-Language Pretraining, and Conventional Image-Text Matching, for preliminary insight.

☆435

Alternatives and similar repositories for Awesome_Matching_Pretraining_Transfering

Users that are interested in Awesome_Matching_Pretraining_Transfering are comparing it to the libraries listed below

Paranioar / SGRAF
[AAAI2021] The code of “Similarity Reasoning and Filtration for Image-Text Matching”
☆219 Updated last year
uta-smile / TCL
code for TCL: Vision-Language Pre-Training with Triple Contrastive Learning, CVPR 2022
☆267 Updated last year
woodfrog / vse_infty
Code for "Learning the Best Pooling Strategy for Visual Semantic Embedding", CVPR 2021 (Oral)
☆162 Updated 3 months ago
AAA-Zheng / Image-Text-Matching-Summary
Summary of Related Research on Image-Text Matching
☆72 Updated 2 years ago
zdou0830 / METER
METER: A Multimodal End-to-end TransformER Framework
☆374 Updated 3 years ago
BMC-SDNU / Cross-Modal-Retrieval
Cross-Modal-Real-valued-Retrieval
☆86 Updated 2 years ago
zengyan-97 / X-VLM
X-VLM: Multi-Grained Vision Language Pre-Training (ICML 2022)
☆487 Updated 3 years ago
LgQu / DIME
Dynamic Modality Interaction Modeling for Image-Text Retrieval. SIGIR'21
☆71 Updated 3 years ago
CrossmodalGroup / NAAF
Implementation of our CVPR2022 paper, Negative-Aware Attention Framework for Image-Text Matching.
☆120 Updated 2 years ago
XinyuXia97 / UCMFH
Source codes of the paper "When CLIP meets Cross-modal Hashing Retrieval: A New Strong Baseline"
☆33 Updated 5 months ago
yawenzeng / Awesome-Cross-Modal-Video-Moment-Retrieval
Continuously updated list of cutting-edge papers on video moment localization, temporal language grounding, or video clip retrieval.
☆258 Updated 2 years ago
phellonchen / awesome-Vision-and-Language-Pre-training
Recent Advances in Vision and Language Pre-training (VLP)
☆296 Updated 2 years ago
CrossmodalGroup / GSMN
Implementation of our CVPR2020 paper, Graph Structured Network for Image-Text Matching
☆169 Updated 5 years ago
CrossmodalGroup / CMCAN
Implementation of our AAAI2022 paper, Show Your Faith: Cross-Modal Confidence-Aware Network for Image-Text Matching.
☆36 Updated 2 years ago
MILVLG / bottom-up-attention.pytorch
A PyTorch reimplementation of bottom-up-attention models
☆304 Updated 3 years ago
foolwood / DRL
[arXiv22] Disentangled Representation Learning for Text-Video Retrieval
☆97 Updated 3 years ago
layer6ai-labs / xpool
https://layer6ai-labs.github.io/xpool/
☆131 Updated 2 years ago
XLearning-SCU / 2021-NeurIPS-NCR
☆78 Updated 2 years ago
danieljf24 / awesome-video-text-retrieval
A curated list of deep learning resources for video-text retrieval.
☆638 Updated 2 years ago
forence / Awesome-Visual-Captioning
This repository focuses on Image Captioning & Video Captioning & Seq-to-Seq Learning & NLP
☆414 Updated 3 years ago
KunpengLi1994 / VSRN
PyTorch code for ICCV'19 paper "Visual Semantic Reasoning for Image-Text Matching"
☆302 Updated 5 years ago
PKU-ICST-MIPL / MKVSE-TOMM2023
☆29 Updated 2 years ago
232525 / PureT
Implementation of 'End-to-End Transformer Based Model for Image Captioning' [AAAI 2022]
☆69 Updated last year
terry-r123 / Awesome-Captioning
A curated list of Multimodal Captioning related research (including image captioning, video captioning, and text captioning)
☆112 Updated 3 years ago
Yutong-Zhou-cv / Awesome-Multimodality
A Survey on multimodal learning research.
☆334 Updated 2 years ago
luo3300612 / image-captioning-DLCT
Official pytorch implementation of paper "Dual-Level Collaborative Transformer for Image Captioning" (AAAI 2021).
☆202 Updated 3 years ago
cshizhe / hgr_v2t
Code accompanying the paper "Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning".
☆212 Updated 5 years ago
LgQu / CAMERA
Context-Aware Multi-View Summarization Network for Image-Text Matching. ACM MM'20
☆30 Updated 3 years ago
zhixiongz / CLIP4CMR
A Comprehensive Empirical Study of Vision-Language Pre-trained Model for Supervised Cross-Modal Retrieval
☆43 Updated 3 years ago
cyh-sj / CGMN
The code of the paper "Cross-Modal Graph Matching Network for Image-Text Retrieval" in ACM Transactions on Multimedia Computing, Communic…
☆46 Updated 2 years ago