MCR-PEFT / C-MCRLinks
☆40Updated last month
Alternatives and similar repositories for C-MCR
Users that are interested in C-MCR are comparing it to the libraries listed below
Sorting:
- [ICLR 23 oral] The Modality Focusing Hypothesis: Towards Understanding Crossmodal Knowledge Distillation☆45Updated 2 years ago
- A python implement for Certifiable Robust Multi-modal Training☆19Updated 3 weeks ago
- Implementation of "VL-Mamba: Exploring State Space Models for Multimodal Learning"☆80Updated last year
- [NeurIPS 2024] MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models☆68Updated 2 months ago
- [Paper][AAAI2024]Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-modal Structured Representations☆144Updated last year
- [ICLR2024] The official implementation of paper "UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling", by …☆74Updated last year
- ☆20Updated 7 months ago
- Code for DeCo: Decoupling token compression from semanchc abstraction in multimodal large language models☆40Updated this week
- [ICML 2024] Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning☆49Updated last year
- [CVPR 2025 Highlight] Official Pytorch codebase for paper: "Assessing and Learning Alignment of Unimodal Vision and Language Models"☆46Updated last month
- [NeurIPS2023] Parameter-efficient Tuning of Large-scale Multimodal Foundation Model☆88Updated last year
- [ICLR 2024] SemiReward: A General Reward Model for Semi-supervised Learning☆70Updated last year
- The repo for "MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance", ICML 2024☆43Updated last year
- [ICML 2025] Official implementation of paper 'Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in…☆141Updated last week
- AlignCLIP: Improving Cross-Modal Alignment in CLIP (ICLR 2025)☆42Updated 4 months ago
- (CVPR2024) MeaCap: Memory-Augmented Zero-shot Image Captioning☆48Updated 11 months ago
- ☆27Updated 2 years ago
- Official code for WACV 2024 paper, "Annotation-free Audio-Visual Segmentation"☆31Updated 9 months ago
- [ICCV2023] The repo for "Boosting Multi-modal Model Performance with Adaptive Gradient Modulation".☆25Updated last year
- The repo for "Enhancing Multi-modal Cooperation via Sample-level Modality Valuation", CVPR 2024☆53Updated 8 months ago
- [CVPR 2023 Highlight & TPAMI] Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning☆120Updated 6 months ago
- ☆32Updated 3 months ago
- The efficient tuning method for VLMs☆80Updated last year
- [ICLR 2024 (Spotlight)] "Frozen Transformers in Language Models are Effective Visual Encoder Layers"☆241Updated last year
- Distilling Large Vision-Language Model with Out-of-Distribution Generalizability (ICCV 2023)☆58Updated last year
- [CVPR 2024] Do you remember? Dense Video Captioning with Cross-Modal Memory Retrieval☆60Updated last year
- [IJCAI 2023] Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment☆52Updated last year
- [ICCV 2023] ViLLA: Fine-grained vision-language representation learning from real-world data☆44Updated last year
- Croc: Pretraining Large Multimodal Models with Cross-Modal Comprehension☆23Updated 8 months ago
- Code for dmrnet☆25Updated last month