HKUST-LongGroup / CoMM
Official repository for CoMM Dataset
☆16Updated this week
Related projects:
- [Arxiv] Calibrated Self-Rewarding Vision Language Models☆35Updated 3 months ago
- Official PyTorch Implementation of Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment☆44Updated 3 months ago
- Code for "Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization"☆40Updated 3 weeks ago
- The official implementation of "MLLMs-Augmented Visual-Language Representation Learning"☆31Updated 6 months ago
- [EMNLP'23] The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models"☆67Updated 5 months ago
- Official Dataloader and Evaluation Scripts for LongVideoBench.☆52Updated last month
- A collection of visual instruction tuning datasets.☆74Updated 6 months ago
- [NeurIPS 2023 Datasets and Benchmarks] "FETV: A Benchmark for Fine-Grained Evaluation of Open-Domain Text-to-Video Generation", Yuanxin L…☆47Updated 6 months ago
- ☆110Updated 4 months ago
- VideoHallucer, the first comprehensive benchmark for hallucination detection in large video-language models (LVLMs)☆21Updated 2 months ago
- Official code for "What Makes for Good Visual Tokenizers for Large Language Models?".☆54Updated last year
- This is the official repo for Debiasing Large Visual Language Models, including a Post-Hoc debias method and Visual Debias Decoding strat…☆66Updated 5 months ago
- HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data (Accepted by CVPR 2024)☆39Updated 2 months ago
- Official repository of MMDU dataset☆61Updated last month
- ☆19Updated last month
- ☆83Updated 9 months ago
- ☆128Updated 8 months ago
- Simple PyTorch implementation of "Libra: Building Decoupled Vision System on Large Language Models" (accepted by ICML 2024)☆41Updated 3 months ago
- Dataset pruning for ImageNet and LAION-2B.☆62Updated 2 months ago
- The official code of paper "Automated Multi-level Preference for MLLMs"☆15Updated 3 weeks ago
- A Survey on Benchmarks of Multimodal Large Language Models☆30Updated last month
- A benchmark for evaluating the capabilities of large vision-language models (LVLMs)☆32Updated 10 months ago
- Dense Connector for MLLMs☆98Updated last month
- ☆20Updated 4 months ago
- Repo for the paper "ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models"☆44Updated 3 weeks ago
- ☆73Updated 8 months ago
- 🌋👵🏻 Yo'LLaVA: Your Personalized Language and Vision Assistant☆47Updated last week
- ☆53Updated 7 months ago
- The official implementation of RAR☆61Updated 5 months ago
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension.☆51Updated 3 months ago