EIT-NLP / Connector-Selection-for-MLLM
[EMNLP 2024 Main] Official implementation of the paper "To Preserve or To Compress: An In-Depth Study of Connector Selection in Multimodal Large Language Models". (by Junyan Lin)
☆13Updated last month
Alternatives and similar repositories for Connector-Selection-for-MLLM:
Users who are interested in Connector-Selection-for-MLLM are comparing it to the repositories listed below
- VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models☆40Updated 6 months ago
- [NeurIPS 2024] Repo for the paper "ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models"☆137Updated last week
- [ECCV 2024] Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs☆94Updated 2 months ago
- A Self-Training Framework for Vision-Language Reasoning☆60Updated 2 months ago
- The official implementation of RAR☆79Updated 9 months ago
- ☆63Updated 2 months ago
- 🔥🔥🔥Code for "Empowering Multimodal Large Language Models with Evol-Instruct"☆12Updated last month
- [EMNLP 2024 Findings] The official PyTorch implementation of EchoSight: Advancing Visual-Language Models with Wiki Knowledge.☆52Updated 2 weeks ago
- [NeurIPS'24 Spotlight] Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought …☆206Updated last month
- Official repository of MMDU dataset☆82Updated 3 months ago
- This repository contains the code for SFT, RLHF, and DPO, designed for vision-based LLMs, including the LLaVA models and the LLaMA-3.2-vi…☆97Updated 3 months ago
- Implementation of "VL-Mamba: Exploring State Space Models for Multimodal Learning"☆79Updated 10 months ago
- ☆59Updated 7 months ago
- MMICL, a state-of-the-art VLM with in-context learning ability, from ICL, PKU☆44Updated last year
- Code for the ICML 2024 paper "Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition"☆82Updated last month
- ☆60Updated this week
- VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection☆46Updated this week
- [ICML 2024] Official implementation for "HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding"☆78Updated last month
- A Survey on Benchmarks of Multimodal Large Language Models☆79Updated 2 weeks ago
- ☆14Updated last year
- ☆25Updated 6 months ago
- [EMNLP'23] The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models"☆77Updated 9 months ago
- PyTorch Implementation of "Divide, Conquer and Combine: A Training-Free Framework for High-Resolution Image Perception in Multimodal Larg…☆19Updated last month
- Official repository of DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models☆78Updated 4 months ago
- ☆36Updated 2 weeks ago
- Up-to-date curated list of state-of-the-art research, papers, and resources on hallucination in large vision-language models☆72Updated last week
- [NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models☆61Updated 7 months ago
- [CVPR 2024] Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension☆43Updated 9 months ago
- ☆32Updated last week
- Official implementation of paper 'Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal …☆38Updated last month