fawazsammani / awesome-vision-language-pretraining

Awesome Vision-Language Pretraining Papers

☆35

Alternatives and similar repositories for awesome-vision-language-pretraining

Users who are interested in awesome-vision-language-pretraining are comparing it to the repositories listed below.

SivanDoveh / TSVLC
Repository for the paper: Teaching Structured Vision & Language Concepts to Vision & Language Models
☆47 Updated 2 years ago
BeierZhu / Prompt-align
[ICCV 2023] Prompt-aligned Gradient for Prompt Tuning
☆167 Updated 2 years ago
guozix / TaI-DPT
☆94 Updated 2 years ago
deepglint / ALIP
[ICCV 2023] ALIP: Adaptive Language-Image Pre-training with Synthetic Caption
☆101 Updated 2 years ago
vishaal27 / SuS-X
Code for the paper: "SuS-X: Training-Free Name-Only Transfer of Vision-Language Models" [ICCV'23]
☆105 Updated 2 years ago
allenai / reclip
☆88 Updated 3 years ago
yuxiaochen1103 / FDT
☆62 Updated 2 years ago
sarahpratt / CuPL
☆193 Updated 2 years ago
thunlp / PEVL
Source code for EMNLP 2022 paper “PEVL: Position-enhanced Pre-training and Prompt Tuning for Vision-language Models”
☆48 Updated 2 years ago
luogen1996 / SimREC
A lightweight codebase for referring expression comprehension and segmentation
☆55 Updated 3 years ago
Computer-Vision-in-the-Wild / Elevater_Toolkit_IC
Toolkit for Elevater Benchmark
☆75 Updated 2 years ago
bladewaltz1 / PromptSwitch
☆30 Updated 2 years ago
allenai / close
☆59 Updated 2 years ago
chunmeifeng / SPRC
[ICLR 2024, Spotlight] Sentence-level Prompts Benefit Composed Image Retrieval
☆89 Updated last year
LeapLabTHU / Pseudo-Q
[CVPR 2022] Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding
☆152 Updated last year
yangli18 / VLTVG
Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning, CVPR 2022
☆96 Updated 2 years ago
joeyz0z / ConZIC
Official implementation of "ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing"
☆74 Updated 2 years ago
HJYao00 / Side4Video
☆42 Updated last year
IIGROUP / MAP
☆37 Updated 3 years ago
vinid / neg_clip
NegCLIP.
☆37 Updated 2 years ago
sail-sg / ptp
[CVPR 2023] The code for "Position-guided Text Prompt for Vision-Language Pre-training"
☆151 Updated 2 years ago
Jiaxuan-Li / EVCap
[CVPR 2024] Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension
☆58 Updated last year
geekyutao / TaskRes
Task Residual for Tuning Vision-Language Models (CVPR 2023)
☆73 Updated 2 years ago
lezhang7 / Enhance-FineGrained
[CVPR 2024] Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Fine-grained Understanding
☆51 Updated 6 months ago
tonyhuang2022 / UPL
This repo is the official implementation of UPL (Unsupervised Prompt Learning for Vision-Language Models).
☆117 Updated 3 years ago
htyao89 / KgCoOp
☆104 Updated last year
yuhangzang / UPT
☆60 Updated 6 months ago
thunlp / CPT
Colorful Prompt Tuning for Pre-trained Vision-Language Models
☆49 Updated 3 years ago
kkzhang95 / Awesome-Composed-Multi-modal-Retrieval
A comprehensive survey of Composed Multi-modal Retrieval (CMR), including Composed Image Retrieval (CIR) and Composed Video Retrieval (CV…
☆58 Updated 2 months ago
seanzhuh / SeqTR
SeqTR: A Simple yet Universal Network for Visual Grounding
☆144 Updated last year