zerovl / ZeroVL

[ECCV2022] Contrastive Vision-Language Pre-training with Limited Resources

☆45

Alternatives and similar repositories for ZeroVL

Users who are interested in ZeroVL are comparing it to the repositories listed below.

intersun / LightningDOT
source code and pre-trained/fine-tuned checkpoint for NAACL 2021 paper LightningDOT
☆72 Updated 3 years ago
microsoft / UniTAB
UniTAB: Unifying Text and Box Outputs for Grounded VL Modeling, ECCV 2022 (Oral Presentation)
☆89 Updated 2 years ago
facebookresearch / diht
Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training
☆138 Updated 2 years ago
guilk / VLC
Research code for "Training Vision-Language Transformers from Captions Alone"
☆34 Updated 3 years ago
airsplay / vimpac
☆73 Updated 3 years ago
ZhangYuanhan-AI / OmniBenchmark
[ECCV2022] New benchmark for evaluating pre-trained model; New supervised contrastive learning framework.
☆110 Updated last year
showlab / Region_Learner
The Pytorch implementation for "Video-Text Pre-training with Learned Regions"
☆42 Updated 3 years ago
Deferf / CLIP_Video_Representation
Use CLIP to represent video for Retrieval Task
☆70 Updated 4 years ago
tsujuifu / pytorch_violet
A PyTorch implementation of VIOLET
☆139 Updated last year
researchmm / soho
[CVPR'21 Oral] Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning
☆208 Updated 3 years ago
zhanxlin / Product1M
Product1M
☆89 Updated 3 years ago
samschulter / omnilabeltools
A Python toolkit for the OmniLabel benchmark providing code for evaluation and visualization
☆22 Updated 9 months ago
microsoft / LAVENDER
A Unified Framework for Video-Language Understanding
☆60 Updated 2 years ago
pals-ttic / adapting-CLIP
☆65 Updated 2 years ago
allenai / gpv-1
A task-agnostic vision-language architecture as a step towards General Purpose Vision
☆92 Updated 4 years ago
easonnie / mlp-vil
MLPs for Vision and Language Modeling (Coming Soon)
☆27 Updated 3 years ago
igorbrigadir / DownloadConceptualCaptions
Reliably download millions of images efficiently
☆117 Updated 4 years ago
mzhaoshuai / CenterCLIP
[SIGIR 2022] CenterCLIP: Token Clustering for Efficient Text-Video Retrieval. Also, a text-video retrieval toolbox based on CLIP + fast p…
☆133 Updated 3 years ago
Vision-CAIR / LTVRR
☆35 Updated 2 years ago
Cuberick-Orion / CIRR
Official repository of ICCV 2021 - Image Retrieval on Real-life Images with Pre-trained Vision-and-Language Models
☆124 Updated last month
adobe-research / vaw_dataset
This repository provides data for the VAW dataset as described in the CVPR 2021 paper titled "Learning to Predict Visual Attributes in th…
☆68 Updated 3 years ago
zhjohnchan / SK-VG
[CVPR-2023] The official dataset of Advancing Visual Grounding with Scene Knowledge: Benchmark and Method.
☆32 Updated 2 years ago
klauscc / VindLU
☆110 Updated 2 years ago
gaopengcuhk / Pretrained-Pix2Seq
Replication of Pix2Seq with Pretrained Model
☆59 Updated 4 years ago
facebookresearch / CiT
Code for the paper titled "CiT: Curation in Training for Effective Vision-Language Data".
☆78 Updated 2 years ago
jayleicn / singularity
[ACL 2023] Official PyTorch code for Singularity model in "Revealing Single Frame Bias for Video-and-Language Learning"
☆136 Updated 2 years ago
postBG / CosMo.pytorch
Official Implementation of CoSMo: Content-Style Modulation for Image Retrieval with Text Feedback presented in CVPR 2021.
☆66 Updated 3 years ago
zmykevin / UVLP
CVPR 2022 (Oral) Pytorch Code for Unsupervised Vision-and-Language Pre-training via Retrieval-based Multi-Granular Alignment
☆22 Updated 3 years ago
VALUE-Leaderboard / StarterCode
Starter Code for VALUE benchmark
☆80 Updated 3 years ago
facebookresearch / OTTER
This code provides a PyTorch implementation for OTTER (Optimal Transport distillation for Efficient zero-shot Recognition), as described …
☆69 Updated 3 years ago