xinke-wang / ModaVerseLinks
[CVPR2024] ModaVerse: Efficiently Transforming Modalities with LLMs
☆29Updated last year
Alternatives and similar repositories for ModaVerse
Users that are interested in ModaVerse are comparing it to the libraries listed below
Sorting:
- [MM2024, oral] "Self-Supervised Visual Preference Alignment" https://arxiv.org/abs/2404.10501☆56Updated 11 months ago
- ☆55Updated last year
- ☆108Updated last year
- [NeurIPS 2024] MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models☆68Updated 2 months ago
- [ICLR 2025] MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation☆89Updated 7 months ago
- [ACL 2024] Multi-modal preference alignment remedies regression of visual instruction tuning on language model☆46Updated 8 months ago
- ☆90Updated 3 months ago
- Official PyTorch implementation for "Diffusion Models and Semi-Supervised Learners Benefit Mutually with Few Labels"☆94Updated last year
- [ICLR 2024] SemiReward: A General Reward Model for Semi-supervised Learning☆70Updated last year
- [NeurIPS'24] Official PyTorch Implementation of Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment☆58Updated 9 months ago
- The official code for paper "EasyGen: Easing Multimodal Generation with a Bidirectional Conditional Diffusion Model and LLMs"☆74Updated 7 months ago
- ☆134Updated last year
- [ICLR 2024 (Spotlight)] "Frozen Transformers in Language Models are Effective Visual Encoder Layers"☆241Updated last year
- [NeurIPS'24 Oral] HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning☆216Updated 7 months ago
- [ICML 2025] Official implementation of paper 'Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in…☆141Updated last week
- HallE-Control: Controlling Object Hallucination in LMMs☆31Updated last year
- SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models☆124Updated 2 months ago
- [Arxiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning☆86Updated last year
- ☆20Updated 8 months ago
- Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning☆157Updated 2 years ago
- [TMLR] Public code repo for paper "A Single Transformer for Scalable Vision-Language Modeling"☆143Updated 8 months ago
- [ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models☆96Updated 9 months ago
- ☆147Updated 10 months ago
- CLIP-MoE: Mixture of Experts for CLIP☆42Updated 9 months ago
- This is the official repo for Debiasing Large Visual Language Models, including a Post-Hoc debias method and Visual Debias Decoding strat…☆78Updated 4 months ago
- EMPO, A Fully Unsupervised RLVR Method☆51Updated this week
- Official code for our paper, "LoRA-Pro: Are Low-Rank Adapters Properly Optimized? "☆126Updated 3 months ago
- (ICLR2025 Spotlight) DEEM: Official implementation of Diffusion models serve as the eyes of large language models for image perception.☆35Updated 2 weeks ago
- [ICCV 2025] ONLY: One-Layer Intervention Sufficiently Mitigates Hallucinations in Large Vision-Language Models☆15Updated last week
- Official Pytorch Implementation of Our CVPR2023 Paper: "Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image…☆59Updated last year