William-wAng618 / M2PT

Official repo of M$^2$PT: Multimodal Prompt Tuning for Zero-shot Instruction Learning

☆27

Alternatives and similar repositories for M2PT

Users who are interested in M2PT are comparing it to the libraries listed below.

zackschen / CoIN
Instruction Tuning in Continual Learning paradigm
☆66 Updated 9 months ago
zycheiheihei / Transferable-Visual-Prompting
[CVPR2024 Highlight] Official implementation for Transferable Visual Prompting. The paper "Exploring the Transferability of Visual Prompt…
☆46 Updated 11 months ago
Ruiyang-061X / VL-Uncertainty
🔎Official code for our paper: "VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation".
☆46 Updated 8 months ago
Lackel / AGLA
[CVPR 2025] Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention
☆50 Updated last year
yu-rp / apiprompting
[ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models
☆106 Updated last year
leaves162 / CLIPtrase
cliptrase
☆47 Updated last year
seilk / VisAttnSink
[ICLR 2025] See What You Are Told: Visual Attention Sink in Large Multimodal Models
☆69 Updated 9 months ago
lloongx / DIKI
[ECCV 2024] Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models
☆54 Updated last year
BatsResearch / menghini-neurips23-code
Exploring prompt tuning with pseudolabels for multiple modalities, learning settings, and training strategies.
☆50 Updated last year
ZhangqiJiang07 / middle_layers_indicating_hallucinations
[CVPR 2025] Devils in Middle Layers of Large Vision-Language Models: Interpreting, Detecting and Mitigating Object Hallucinations via Att…
☆52 Updated last month
mrflogs / ICLR24
Official code for ICLR 2024 paper, "A Hard-to-Beat Baseline for Training-free CLIP-based Adaptation"
☆85 Updated last year
BeierZhu / GLA
[NeurIPS 2023] Generalized Logit Adjustment
☆39 Updated last year
Ziwei-Zheng / Nullu
Code for paper: Nullu: Mitigating Object Hallucinations in Large Vision-Language Models via HalluSpace Projection
☆46 Updated 8 months ago
silicx / LoRS_Distill
Code for our ICML'24 paper on multimodal dataset distillation
☆41 Updated last year
double125 / MADTP
MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer
☆48 Updated last year
mengcaopku / Continual-LLaVA
☆16 Updated last year
Yangsenqiao / vida
[ICLR 2024] ViDA: Homeostatic Visual Domain Adapter for Continual Test Time Adaptation
☆71 Updated last year
ChengHan111 / E2VPT
Official Pytorch implementation of "E2VPT: An Effective and Efficient Approach for Visual Prompt Tuning". (ICCV2023)
☆70 Updated last year
ziplab / SPT
[ICCV 2023 oral] This is the official repository for our paper: "Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning".
☆74 Updated 2 years ago
mrflogs / SHIP
Official code for ICCV 2023 paper, "Improving Zero-Shot Generalization for CLIP with Synthesized Prompts"
☆103 Updated last year
OpenSparseLLMs / CLIP-MoE
CLIP-MoE: Mixture of Experts for CLIP
☆50 Updated last year
CRIPAC-DIG / LogicCheckGPT
[ACL 2024] Logical Closed Loop: Uncovering Object Hallucinations in Large Vision-Language Models. Detect and mitigate object hallucinatio…
☆24 Updated 9 months ago
princetonvisualai / multimodal_dataset_distillation
☆59 Updated 10 months ago
machengcheng2016 / Subspace-Prompt-Learning
Official code for "Understanding and Mitigating Overfitting in Prompt Tuning for Vision-Language Models" (TCSVT'2023)
☆28 Updated last year
Koorye / DePT
[CVPR 2024] Official implementation of the paper "DePT: Decoupled Prompt Tuning"
☆109 Updated 5 months ago
Qinyu-Allen-Zhao / LVLM-LP
The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models?
☆40 Updated last year
meetdavidwan / crg
PyTorch code for "Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training"
☆37 Updated last year
runtsang / VFPT
【NeurIPS 2024】Official implementation of "Visual Fourier Prompt Tuning"
☆36 Updated 10 months ago
zhangce01 / DPE-CLIP
[NeurIPS 2024] Code for Dual Prototype Evolving for Test-Time Generalization of Vision-Language Models
☆44 Updated 8 months ago
Sreyan88 / VDGD
Code for ICLR 2025 Paper: Visual Description Grounding Reduces Hallucinations and Boosts Reasoning in LVLMs
☆22 Updated 6 months ago