JierunChen / Ref-L4
Evaluation code for Ref-L4, a new referring expression comprehension (REC) benchmark in the large multimodal model (LMM) era
☆13 Updated 2 months ago
Related projects:
- ☆31 Updated 3 months ago
- DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception ☆103 Updated 3 weeks ago
- ☆20 Updated 9 months ago
- Stay tuned! ☆11 Updated 5 months ago
- ChatterBox: Multi-round Multimodal Referring and Grounding, Multimodal, Multi-round dialogues ☆49 Updated 4 months ago
- Repo for the paper "ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models" ☆44 Updated 3 weeks ago
- [ECCV 2024] Parrot Captions Teach CLIP to Spot Text ☆58 Updated 2 weeks ago
- ☆100 Updated last month
- Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want ☆52 Updated 5 months ago
- Official code for "What Makes for Good Visual Tokenizers for Large Language Models?". ☆54 Updated last year
- ☆46 Updated 10 months ago
- ☆32 Updated 8 months ago
- VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation ☆84 Updated last week
- VideoHallucer, the first comprehensive benchmark for hallucination detection in large video-language models (LVLMs) ☆21 Updated 2 months ago
- The official implementation of RAR ☆61 Updated 5 months ago
- [ECCV 2024] Elysium: Exploring Object-level Perception in Videos via MLLM ☆47 Updated 2 months ago
- [ECCV 2024] Official code implementation of Merlin: Empowering Multimodal LLMs with Foresight Minds ☆80 Updated 2 months ago
- The official GitHub page for "What Makes for Good Visual Instructions? Synthesizing Complex Visual Reasoning Instructions for Visual Ins… ☆18 Updated 10 months ago
- [ECCV 2024] Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs ☆45 Updated last month
- ☆32 Updated 3 months ago
- Official repo for StableLLAVA ☆90 Updated 8 months ago
- ☆83 Updated 9 months ago
- Adapting LLaMA Decoder to Vision Transformer ☆25 Updated 4 months ago
- Public code repo for paper "A Single Transformer for Scalable Vision-Language Modeling" ☆103 Updated last month
- OpenMMLab Detection Toolbox and Benchmark for V3Det ☆15 Updated 5 months ago
- FreeVA: Offline MLLM as Training-Free Video Assistant ☆42 Updated 3 months ago
- [ICML 2024] This repository includes the official implementation of our paper "Rejuvenating image-GPT as Strong Visual Representation Lea… ☆96 Updated 4 months ago
- Dense Connector for MLLMs ☆98 Updated last month
- ☆93 Updated 3 months ago
- [EMNLP'23] The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models" ☆67 Updated 5 months ago