WillDreamer / Aurora

[NeurIPS2023] Parameter-efficient Tuning of Large-scale Multimodal Foundation Model

☆88

Alternatives and similar repositories for Aurora

Users who are interested in Aurora are comparing it to the libraries listed below.

JieShibo / MemVP
[ICML 2024] Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning
☆49 Updated last year
winycg / CLIP-KD
[CVPR-2024] Official implementations of CLIP-KD: An Empirical Study of CLIP Model Distillation
☆123 Updated last year
yfzhang114 / LLaVA-Align
[ACM Multimedia 2025] This is the official repo for Debiasing Large Visual Language Models, including a Post-Hoc debias method and Visual…
☆81 Updated 5 months ago
ZhengYu518 / VL-Mamba
Implementation of "VL-Mamba: Exploring State Space Models for Multimodal Learning"
☆82 Updated last year
palchenli / VL-Instruction-Tuning
☆91 Updated last year
Koorye / DePT
[CVPR 2024] Official implementation of the paper "DePT: Decoupled Prompt Tuning"
☆107 Updated 2 months ago
scale-lab / MTLoRA
The official implementation for MTLoRA: A Low-Rank Adaptation Approach for Efficient Multi-Task Learning (CVPR '24)
☆57 Updated last month
UMass-Embodied-AGI / Mod-Squad
☆91 Updated 2 years ago
lezhang7 / SAIL
[CVPR 2025 Highlight] Official Pytorch codebase for paper: "Assessing and Learning Alignment of Unimodal Vision and Language Models"
☆47 Updated last month
ChengHan111 / E2VPT
Official Pytorch implementation of "E2VPT: An Effective and Efficient Approach for Visual Prompt Tuning". (ICCV2023)
☆72 Updated last year
heliossun / SQ-LLaVA
Visual self-questioning for large vision-language assistant.
☆42 Updated 2 weeks ago
yuecao0119 / MMFuser
The official implementation of the paper "MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding". …
☆57 Updated 9 months ago
ziplab / SPT
[ICCV 2023 oral] This is the official repository for our paper: "Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning".
☆72 Updated last year
JiuTian-VL / JiuTian-LION
[CVPR 2024] LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge
☆150 Updated last year
JiuTian-VL / MoME
[NeurIPS 2024] MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models
☆69 Updated 3 months ago
Liuziyu77 / RAR
The official implementation of RAR
☆90 Updated last year
Paranioar / UniPT
[CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory"
☆67 Updated 9 months ago
Hodasia / Awesome-Vision-Language-Finetune
Awesome List of Vision Language Prompt Papers
☆46 Updated last year
ShuvenduRoy / CoPrompt
[ICLR'24] Consistency-guided Prompt Learning for Vision-Language Models
☆78 Updated last year
MME-Benchmarks / MME-RealWorld
✨✨ [ICLR 2025] MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?
☆129 Updated 5 months ago
HenryHZY / VL-PET
[ICCV2023] Official code for "VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control"
☆53 Updated last year
gyhdog99 / MoCLE
MoCLE (First MLLM with MoE for instruction customization and generalization!) (https://arxiv.org/abs/2312.12379)
☆42 Updated last month
minglllli / CLS-RL
Think or Not Think: A Study of Explicit Thinking in Rule-Based Visual Reinforcement Fine-Tuning
☆56 Updated 2 months ago
BAAI-DCAI / DataOptim
A collection of visual instruction tuning datasets.
☆76 Updated last year
JieShibo / PETL-ViT
[ICCV 2023 & AAAI 2023] Binary Adapters & FacT, [Tech report] Convpass
☆192 Updated 2 years ago
linyq2117 / TagCLIP
[AAAI 2024] TagCLIP: A Local-to-Global Framework to Enhance Open-Vocabulary Multi-Label Classification of CLIP Without Training
☆100 Updated last year
yaolinli / DeCo
Code for DeCo: Decoupling token compression from semantic abstraction in multimodal large language models
☆61 Updated 3 weeks ago
GasolSun36 / MVP
Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning
☆22 Updated 10 months ago
SY-Xuan / Pink
Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs
☆91 Updated 6 months ago
mrflogs / SHIP
Official code for ICCV 2023 paper, "Improving Zero-Shot Generalization for CLIP with Synthesized Prompts"
☆101 Updated last year