ilkerkesen / frozen

A PyTorch implementation of "Multimodal Few-Shot Learning with Frozen Language Models" using OPT.

☆43

Alternatives and similar repositories for frozen

Users who are interested in frozen are comparing it to the libraries listed below.

ylsung / VL_adapter
PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR 2022)
☆207 Updated 2 years ago
yangbang18 / MultiCapCLIP
(ACL 2023) MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning
☆36 Updated last year
goel-shashank / CyCLIP
☆120 Updated 2 years ago
YulongBonjour / SimVLM
SimVLM: Simple Visual Language Model Pretraining with Weak Supervision
☆36 Updated 3 years ago
allenai / close
☆59 Updated 2 years ago
Weixin-Liang / Modality-Gap
Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning
☆165 Updated 3 years ago
microsoft / FIBER
Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone
☆130 Updated 2 years ago
zengyan-97 / X2-VLM
All-In-One VLM: Image + Video + Transfer to Other Languages / Domains (TPAMI 2023)
☆166 Updated last year
joeyz0z / ConZIC
Official implementation of "ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing"
☆74 Updated 2 years ago
YeonwooSung / LIMoE-pytorch
PyTorch implementation of LIMoE
☆52 Updated last year
PLUM-Lab / MultiInstruct
MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning
☆134 Updated 2 years ago
RERV / UniAdapter
[ICLR2024] The official implementation of paper "UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling", by …
☆77 Updated last year
microsoft / LAVENDER
A Unified Framework for Video-Language Understanding
☆60 Updated 2 years ago
yuxiaochen1103 / FDT
☆62 Updated 2 years ago
MikeWangWZHL / VidIL
PyTorch code for "Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners"
☆115 Updated 3 years ago
naver-ai / pcmepp
Official PyTorch implementation of "Improved Probabilistic Image-Text Representations" (ICLR 2024)
☆58 Updated last year
edchengg / oven_eval
ICCV 2023 (Oral) "Open-domain Visual Entity Recognition: Towards Recognizing Millions of Wikipedia Entities"
☆43 Updated 5 months ago
jayleicn / singularity
[ACL 2023] Official PyTorch code for Singularity model in "Revealing Single Frame Bias for Video-and-Language Learning"
☆136 Updated 2 years ago
DavidHuji / CapDec
CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (findings)
☆201 Updated last year
sIncerass / MVLPT
Code for "Multitask Vision-Language Prompt Tuning" https://arxiv.org/abs/2211.11720
☆57 Updated last year
naver-ai / pcme
Official PyTorch implementation of "Probabilistic Cross-Modal Embedding" (CVPR 2021)
☆134 Updated last year
Computer-Vision-in-the-Wild / Elevater_Toolkit_IC
Toolkit for Elevater Benchmark
☆76 Updated 2 years ago
md-mohaiminul / ViS4mer
☆57 Updated 3 years ago
zinengtang / TVLT
PyTorch code for “TVLT: Textless Vision-Language Transformer” (NeurIPS 2022 Oral)
☆124 Updated 2 years ago
amazon-science / mix-generation
MixGen: A New Multi-Modal Data Augmentation
☆126 Updated 2 years ago
ChenDelong1999 / polite-flamingo
🦩 Visual Instruction Tuning with Polite Flamingo - training multi-modal LLMs to be both clever and polite! (AAAI-24 Oral)
☆64 Updated last year
Yui010206 / SeViLA
[NeurIPS 2023] Self-Chained Image-Language Model for Video Localization and Question Answering
☆189 Updated last year
MikeWangWZHL / Paxion
Repo for paper: "Paxion: Patching Action Knowledge in Video-Language Foundation Models" (NeurIPS 2023 Spotlight)
☆37 Updated 2 years ago
fawazsammani / nlxgpt
NLX-GPT: A Model for Natural Language Explanations in Vision and Vision-Language Tasks, CVPR 2022 (Oral)
☆48 Updated last year
thunlp / CPT
Colorful Prompt Tuning for Pre-trained Vision-Language Models
☆49 Updated 3 years ago