WillDreamer / Aurora
[NeurIPS2023] Parameter-efficient Tuning of Large-scale Multimodal Foundation Model
☆87Updated last year
Alternatives and similar repositories for Aurora:
Users who are interested in Aurora are comparing it to the repositories listed below:
- [ICML 2024] Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning☆49Updated 11 months ago
- This is the official repo for Debiasing Large Visual Language Models, including a Post-Hoc debias method and Visual Debias Decoding strat…☆77Updated last month
- Implementation of "VL-Mamba: Exploring State Space Models for Multimodal Learning"☆81Updated last year
- [CVPR 2025 Highlight] Official Pytorch codebase for paper: "Assessing and Learning Alignment of Unimodal Vision and Language Models"☆32Updated last week
- [CVPR-2024] Official implementations of CLIP-KD: An Empirical Study of CLIP Model Distillation☆106Updated 9 months ago
- Code release for VTW (AAAI 2025) Oral☆34Updated 2 months ago
- [CVPR 2024] The official pytorch implementation of "A General and Efficient Training for Transformer via Token Expansion".☆44Updated 11 months ago
- [CVPR 2024] Official implementation of the paper "DePT: Decoupled Prompt Tuning"☆96Updated 4 months ago
- Code and Dataset for the paper "LAMM: Label Alignment for Multi-Modal Prompt Learning" AAAI 2024☆32Updated last year
- [NeurIPS'24] Official PyTorch Implementation of Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment☆57Updated 6 months ago
- Distilling Large Vision-Language Model with Out-of-Distribution Generalizability (ICCV 2023)☆56Updated last year
- The official implementation of RAR☆85Updated last year
- [CVPR 2024] LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge☆137Updated 8 months ago
- [CVPR 2024] Official Code for the Paper "Compositional Chain-of-Thought Prompting for Large Multimodal Models"☆122Updated 9 months ago
- CLIP-MoE: Mixture of Experts for CLIP☆31Updated 6 months ago
- [ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions☆129Updated 4 months ago
- [ICLR 2025] LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation☆127Updated 2 weeks ago
- Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs☆90Updated 3 months ago
- [ICCV2023] Official code for "VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control"☆53Updated last year
- MoCLE (First MLLM with MoE for instruction customization and generalization!) (https://arxiv.org/abs/2312.12379)☆35Updated last year
- Code for "DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets", accepted at NeurIPS 2023 (Main confer…☆21Updated last year
- The official implementation for MTLoRA: A Low-Rank Adaptation Approach for Efficient Multi-Task Learning (CVPR '24)☆45Updated last month
- ✨✨ [ICLR 2025] MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?☆108Updated last month
- [ECCV 2024] Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs☆112Updated 5 months ago