Yutong-Zhou-cv / Awesome-MultimodalityLinks

A Survey on multimodal learning research.

☆333

Alternatives and similar repositories for Awesome-Multimodality

Users that are interested in Awesome-Multimodality are comparing it to the libraries listed below

Sorting:

phellonchen / awesome-Vision-and-Language-Pre-training
Recent Advances in Vision and Language Pre-training (VLP)
☆294Updated 2 years ago
jianghaojun / Awesome-Parameter-Efficient-Transfer-Learning
A collection of parameter-efficient transfer learning papers focusing on computer vision and multimodal domains.
☆409Updated last year
wangxiao5791509 / MultiModal_BigModels_Survey
[MIR-2023-Survey] A continuously updated paper list for multi-modal pre-trained big models
☆288Updated 3 months ago
LijieFan / LaCLIP
[NeurIPS 2023] Text data, code and pre-trained models for paper "Improving CLIP Training with Language Rewrites"
☆286Updated last year
ys-zong / awesome-self-supervised-multimodal-learning
[T-PAMI] A curated list of self-supervised multimodal learning resources.
☆263Updated last year
ttengwang / Awesome_Prompting_Papers_in_Computer_Vision
A curated list of prompt-based paper in computer vision and vision-language learning.
☆925Updated last year
JindongGu / Awesome-Prompting-on-Vision-Language-Model
This repo lists relevant papers summarized in our survey paper: A Systematic Survey of Prompt Engineering on Vision-Language Foundation …
☆494Updated 7 months ago
uta-smile / TCL
code for TCL: Vision-Language Pre-Training with Triple Contrastive Learning, CVPR 2022
☆266Updated last year
amazon-science / mix-generation
MixGen: A New Multi-Modal Data Augmentation
☆126Updated 2 years ago
HenryHZY / Awesome-Multimodal-LLM
Research Trends in LLM-guided Multimodal Learning.
☆355Updated 2 years ago
Weixin-Liang / Modality-Gap
Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning
☆161Updated 3 years ago
revantteotia / clip-training
Code to train CLIP model
☆122Updated 3 years ago
gaopengcuhk / CLIP-Adapter
☆551Updated 3 years ago
ucasligang / awesome-MIM
Reading list for research topics in Masked Image Modeling
☆336Updated 10 months ago
muzairkhattak / multimodal-prompt-learning
[CVPR 2023] Official repository of paper titled "MaPLe: Multi-modal Prompt Learning".
☆779Updated 2 years ago
ZhangYuanhan-AI / NOAH
[TPAMI] Searching prompt modules for parameter-efficient transfer learning.
☆235Updated last year
ziqipang / LM4VisualEncoding
[ICLR 2024 (Spotlight)] "Frozen Transformers in Language Models are Effective Visual Encoder Layers"
☆244Updated last year
zengyan-97 / X-VLM
X-VLM: Multi-Grained Vision Language Pre-Training (ICML 2022)
☆485Updated 2 years ago
zengyan-97 / X2-VLM
All-In-One VLM: Image + Video + Transfer to Other Languages / Domains (TPAMI 2023)
☆165Updated last year
Atomic-man007 / Awesome_Multimodel_LLM
Awesome_Multimodel is a curated GitHub repository that provides a comprehensive collection of resources for Multimodal Large Language Mod…
☆342Updated 7 months ago
junchen14 / Multi-Modal-Transformer
The repository collects many various multi-modal transformer architectures, including image transformer, video transformer, image-languag…
☆229Updated 3 years ago
zdou0830 / METER
METER: A Multimodal End-to-end TransformER Framework
☆373Updated 2 years ago
gaopengcuhk / Tip-Adapter
☆639Updated last year
RitaRamo / smallcap
SmallCap: Lightweight Image Captioning Prompted with Retrieval Augmentation
☆124Updated last year
awaisrauf / Awesome-CV-Foundational-Models
☆529Updated 11 months ago
mertyg / vision-language-models-are-bows
Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" Oral @ ICLR …
☆286Updated 2 years ago
ylsung / VL_adapter
PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR2022)
☆207Updated 2 years ago
Baijiong-Lin / LoRA-Torch
PyTorch Reimplementation of LoRA (featuring with supporting nn.MultiheadAttention in OpenCLIP)
☆72Updated 4 months ago
salesforce / ALPRO
Align and Prompt: Video-and-Language Pre-training with Entity Prompts
☆187Updated 5 months ago
ttengwang / Awesome_Long_Form_Video_Understanding
Awesome papers & datasets specifically focused on long-term videos.
☆321Updated 2 weeks ago