MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria
☆72, updated Oct 16, 2024
Alternatives and similar repositories for MLLM-Bench
Users interested in MLLM-Bench are comparing it to the repositories listed below.
- [ICML 2024] Repo for the paper "Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models" (☆22, updated Jan 1, 2025)
- [NAACL 2025] Representing Rule-based Chatbots with Transformers (☆23, updated Feb 9, 2025)
- This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context" (☆36, updated Jul 11, 2024)
- Code for "Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization" (☆60, updated Aug 23, 2024)
- [ACL 2024] Multi-modal preference alignment remedies regression of visual instruction tuning on language model (☆47, updated Nov 10, 2024)
- This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for E… (☆546, updated Feb 12, 2026)
- [NeurIPS 2024] The official code of the paper "Automated Multi-level Preference for MLLMs" (☆22, updated Sep 26, 2024)
- [NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models (☆86, updated Oct 26, 2025)
- Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization (☆100, updated Jan 30, 2024)
- [CVPR 2024] A benchmark for evaluating Multimodal LLMs using multiple-choice questions (☆360, updated Jan 14, 2025)
- ☆88, updated Jul 4, 2024
- VideoHallucer: the first comprehensive benchmark for hallucination detection in large video-language models (LVLMs) (☆42, updated Dec 16, 2025)
- [EMNLP 2023] The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models" (☆107, updated Aug 21, 2025)
- Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model (☆281, updated Jun 25, 2024)
- [NeurIPS 2023 Datasets and Benchmarks Track] LAMM: Multi-Modal Large Language Models and Applications as AI Agents (☆318, updated Apr 16, 2024)
- [ICML 2021] Code for the paper "iDARTS: Differentiable Architecture Search with Stochastic Implicit Gradients" (☆10, updated May 27, 2021)
- MaXM is a suite of test-only benchmarks for multilingual visual question answering in 7 languages: English (en), French (fr), Hindi (hi),… (☆13, updated Jan 16, 2024)
- VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models (☆77, updated Jul 13, 2024)
- [CVPR 2024] The official implementation of the paper "Synthesize, Diagnose, and Optimize: Towards Fine-Grained Vision-Language Understanding" (☆52, updated Jun 16, 2025)
- Official code for "Evaluations of Machine Learning Privacy Defenses are Misleading" (https://arxiv.org/abs/2404.17399) (☆12, updated Apr 29, 2024)
- [CVPR 2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts (☆336, updated Jul 17, 2024)
- [CVPR 2024] LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge (☆153, updated Sep 3, 2025)
- [NAACL 2025 Oral] Multimodal Needle in a Haystack (MMNeedle): Benchmarking Long-Context Capability of Multimodal Large Language Models (☆54, updated this week)
- [ICLR 2025] Mathematical Visual Instruction Tuning for Multi-modal Large Language Models (☆152, updated Dec 5, 2024)
- [ICLR 2025] MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models (☆94, updated Sep 14, 2024)
- ☆55, updated Apr 1, 2024
- [MM 2024, Oral] "Self-Supervised Visual Preference Alignment" (https://arxiv.org/abs/2404.10501) (☆61, updated Jul 26, 2024)
- [EMNLP 2024] Mitigating Open-Vocabulary Caption Hallucinations (☆18, updated Oct 18, 2024)
- FaithScore: Fine-grained Evaluations of Hallucinations in Large Vision-Language Models (☆32, updated Nov 27, 2025)
- [ICLR 2025] MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans? (☆152, updated Oct 21, 2025)
- Multilingual Medicine: Model, Dataset, Benchmark, Code (☆199, updated Oct 15, 2024)
- Official repository of the MMDU dataset (☆104, updated Sep 29, 2024)
- A family of highly capable yet efficient large multimodal models (☆191, updated Aug 23, 2024)
- ☆15, updated Jun 14, 2024
- VHTest (☆15, updated Oct 31, 2024)
- [CVPR 2025] Code release of F-LMM: Grounding Frozen Large Multimodal Models (☆108, updated May 29, 2025)
- [AAAI 2024] Visual Instruction Generation and Correction (☆96, updated Feb 4, 2024)
- ☆16, updated Oct 21, 2024
- [EMNLP 2024] Official code for "Beyond Embeddings: The Promise of Visual Table in Multi-Modal Models" (☆20, updated Oct 17, 2024)