MCR-PEFT / C-MCR
☆34Updated 11 months ago
Related projects: ⓘ
- Multimodal Learning Method MLA for CVPR 2024☆36Updated 3 months ago
- The repo for "Enhancing Multi-modal Cooperation via Sample-level Modality Valuation", CVPR 2024☆32Updated 2 months ago
- This is the official repo for Debiasing Large Visual Language Models, including a Post-Hoc debias method and Visual Debias Decoding strat…☆66Updated 5 months ago
- [ICML 2024] Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning☆41Updated 4 months ago
- The repo for "MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance", ICML 2024☆25Updated 2 months ago
- Code for dmrnet☆10Updated last month
- Codes for ICML 2024 paper: "Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition"☆34Updated 2 months ago
- Towards a Unified View on Visual Parameter-Efficient Transfer Learning☆26Updated last year
- ☆83Updated 9 months ago
- Implementation of "VL-Mamba: Exploring State Space Models for Multimodal Learning"☆75Updated 5 months ago
- Simple PyTorch implementation of "Libra: Building Decoupled Vision System on Large Language Models" (accepted by ICML 2024)☆41Updated 3 months ago
- The efficient tuning method for VLMs☆72Updated 6 months ago
- Code for paper "AGLA: Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention"☆13Updated 2 months ago
- Learning Descriptive Image Captioning via Semipermeable Maximum Likelihood Estimation (NeurIPS 2023)☆21Updated 11 months ago
- [ICCV2023] Official code for "VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control"☆51Updated 11 months ago
- [Paper][AAAI2024]Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-modal Structured Representations☆104Updated 2 months ago
- (CVPR2024) MeaCap: Memory-Augmented Zero-shot Image Captioning☆31Updated last month
- [ICLR2024] The official implementation of paper "UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling", by …☆68Updated 7 months ago
- PyTorch code for "Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training"☆23Updated 6 months ago
- [CVPR' 2024] Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Fine-grained Understanding☆35Updated last month
- [NeurIPS2023] Parameter-efficient Tuning of Large-scale Multimodal Foundation Model☆80Updated 9 months ago
- [ICCV2023] The repo for "Boosting Multi-modal Model Performance with Adaptive Gradient Modulation".☆24Updated 7 months ago
- [NeurIPS 2023]DDCoT: Duty-Distinct Chain-of-Thought Prompting for Multimodal Reasoning in Language Models☆30Updated 6 months ago
- ☆25Updated last year
- Visual self-questioning for large vision-language assistant.☆22Updated 3 weeks ago
- Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024)☆25Updated last month
- MMICL, a state-of-the-art VLM with the in context learning ability from ICL, PKU☆37Updated 11 months ago
- Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models (AAAI 2024)☆62Updated 7 months ago
- [Arxiv] Calibrated Self-Rewarding Vision Language Models☆35Updated 3 months ago
- [IJCAI 2023] Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment☆44Updated 5 months ago