Sense-GVT / DeCLIPLinks

Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm

☆666

Alternatives and similar repositories for DeCLIP

Users that are interested in DeCLIP are comparing it to the libraries listed below

Sorting:

microsoft / UniCL
[CVPR 2022] Official code for "Unified Contrastive Learning in Image-Text-Label Space"
☆403Updated last year
NVlabs / GroupViT
Official PyTorch implementation of GroupViT: Semantic Segmentation Emerges from Text Supervision, CVPR 2022.
☆773Updated 3 years ago
Zasder3 / train-CLIP
A PyTorch Lightning solution to training OpenAI's CLIP from scratch.
☆713Updated 3 years ago
raoyongming / DenseCLIP
[CVPR 2022] DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting
☆537Updated 2 years ago
facebookresearch / SLIP
Code release for SLIP Self-supervision meets Language-Image Pre-training
☆782Updated 2 years ago
facebookresearch / flip
Official Open Source code for "Scaling Language-Image Pre-training via Masking"
☆428Updated 2 years ago
microsoft / RegionCLIP
[CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"
☆792Updated last year
gaopengcuhk / Tip-Adapter
☆639Updated last year
gaopengcuhk / CLIP-Adapter
☆551Updated 3 years ago
bytedance / ibot
iBOT : Image BERT Pre-Training with Online Tokenizer (ICLR 2022)
☆748Updated 3 years ago
zengyan-97 / X-VLM
X-VLM: Multi-Grained Vision Language Pre-Training (ICML 2022)
☆485Updated 2 years ago
hila-chefer / Transformer-MM-Explainability
[ICCV 2021- Oral] Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decode…
☆870Updated 2 years ago
ashkamath / mdetr
☆1,036Updated 3 years ago
LightDXY / FT-CLIP
CLIP Itself is a Strong Fine-tuner: Achieving 85.7% and 88.0% Top-1 Accuracy with ViT-B and ViT-L on ImageNet
☆223Updated 2 years ago
phellonchen / awesome-Vision-and-Language-Pre-training
Recent Advances in Vision and Language Pre-training (VLP)
☆294Updated 2 years ago
mlfoundations / wise-ft
Robust fine-tuning of zero-shot models
☆744Updated 3 years ago
clip-vil / CLIP-ViL
[ICLR 2022] code for "How Much Can CLIP Benefit Vision-and-Language Tasks?" https://arxiv.org/abs/2107.06383
☆415Updated 2 years ago
lucidrains / x-clip
A concise but complete implementation of CLIP with various experimental improvements from recent papers
☆716Updated 2 years ago
zdou0830 / METER
METER: A Multimodal End-to-end TransformER Framework
☆373Updated 2 years ago
uta-smile / TCL
code for TCL: Vision-Language Pre-Training with Triple Contrastive Learning, CVPR 2022
☆266Updated last year
lucidrains / CoCa-pytorch
Implementation of CoCa, Contrastive Captioners are Image-Text Foundation Models, in Pytorch
☆1,178Updated last year
google-research-datasets / conceptual-12m
Conceptual 12M is a dataset containing (image-URL, caption) pairs collected for vision-and-language pre-training.
☆404Updated 3 months ago
yzhuoning / Awesome-CLIP
Awesome list for research on CLIP (Contrastive Language-Image Pre-Training).
☆1,218Updated last year
google-research / pix2seq
Pix2Seq codebase: multi-tasks with generative modeling (autoregressive and diffusion)
☆930Updated last year
jayleicn / ClipBERT
[CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning…
☆722Updated 2 years ago
Alpha-VL / ConvMAE
ConvMAE: Masked Convolution Meets Masked Autoencoders
☆516Updated 2 years ago
ArrowLuo / CLIP4Clip
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
☆993Updated last year
fundamentalvision / Uni-Perceiver
☆283Updated 2 months ago
LijieFan / LaCLIP
[NeurIPS 2023] Text data, code and pre-trained models for paper "Improving CLIP Training with Language Rewrites"
☆286Updated last year
microsoft / XPretrain
Multi-modality pre-training
☆503Updated last year