xinke-wang / ModaVerse
[CVPR2024] ModaVerse: Efficiently Transforming Modalities with LLMs
☆29 Updated 7 months ago
Alternatives and similar repositories for ModaVerse:
Users who are interested in ModaVerse are comparing it to the repositories listed below.
- [MM2024, oral] "Self-Supervised Visual Preference Alignment" https://arxiv.org/abs/2404.10501 ☆50 Updated 7 months ago
- The official code for paper "EasyGen: Easing Multimodal Generation with a Bidirectional Conditional Diffusion Model and LLMs" ☆73 Updated 3 months ago
- ☆95 Updated 7 months ago
- UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model ☆21 Updated 6 months ago
- This is the official repo for Debiasing Large Visual Language Models, including a Post-Hoc debias method and Visual Debias Decoding strat… ☆75 Updated last week
- [ACL 2024] Multi-modal preference alignment remedies regression of visual instruction tuning on language model ☆34 Updated 3 months ago
- [ICLR 2025] MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation ☆38 Updated 2 months ago
- [NeurIPS'24] Official PyTorch Implementation of Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment ☆57 Updated 5 months ago
- HallE-Control: Controlling Object Hallucination in LMMs ☆29 Updated 10 months ago
- LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models ☆116 Updated 9 months ago
- ☆29 Updated 7 months ago
- LMM solved catastrophic forgetting, AAAI2025 ☆38 Updated 3 months ago
- The official code of the paper "Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate". ☆96 Updated 3 months ago
- [ECCV 2024] Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs ☆100 Updated 3 months ago
- ☆48 Updated this week
- [ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models ☆69 Updated 4 months ago
- [ICLR2025] γ-MOD: Mixture-of-Depth Adaptation for Multimodal Large Language Models ☆31 Updated 2 weeks ago
- PyTorch Implementation of "Divide, Conquer and Combine: A Training-Free Framework for High-Resolution Image Perception in Multimodal Larg… ☆20 Updated 3 weeks ago
- ☆63 Updated 7 months ago
- [NeurIPS 2024] MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models ☆47 Updated 2 months ago
- Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023. ☆32 Updated last year
- Official code for our paper, "LoRA-Pro: Are Low-Rank Adapters Properly Optimized?" ☆102 Updated 4 months ago
- [CVPR2024 Highlight] Official implementation for Transferable Visual Prompting. The paper "Exploring the Transferability of Visual Prompt… ☆38 Updated 2 months ago
- MoCLE (First MLLM with MoE for instruction customization and generalization!) (https://arxiv.org/abs/2312.12379) ☆33 Updated 10 months ago
- [ICML 2024] Official implementation for "HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding" ☆81 Updated 3 months ago
- ☆54 Updated last year
- Data distillation benchmark ☆55 Updated 2 weeks ago
- CLIP-MoE: Mixture of Experts for CLIP ☆24 Updated 4 months ago
- Official code for ICLR 2024 paper, "A Hard-to-Beat Baseline for Training-free CLIP-based Adaptation" ☆76 Updated 10 months ago