MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria
☆73Oct 16, 2024Updated last year
Alternatives and similar repositories for MLLM-Bench
Users that are interested in MLLM-Bench are comparing it to the libraries listed below
- This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"☆36Jul 11, 2024Updated last year
- Multilingual Medicine: Model, Dataset, Benchmark, Code☆199Oct 15, 2024Updated last year
- [ICML2024] Repo for the paper "Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models"☆23Jan 1, 2025Updated last year
- Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model☆281Jun 25, 2024Updated last year
- [ACL 2024] Multi-modal preference alignment remedies regression of visual instruction tuning on language model☆47Nov 10, 2024Updated last year
- FaithScore: Fine-grained Evaluations of Hallucinations in Large Vision-Language Models☆33Nov 27, 2025Updated 3 months ago
- MaXM is a suite of test-only benchmarks for multilingual visual question answering in 7 languages: English (en), French (fr), Hindi (hi),…☆13Jan 16, 2024Updated 2 years ago
- (CVPR2024) A benchmark for evaluating Multimodal LLMs using multiple-choice questions.☆361Jan 14, 2025Updated last year
- This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for E…☆549Feb 12, 2026Updated last month
- We introduce a new approach, Token Reduction using CLIP Metric (TRIM), aimed at improving the efficiency of MLLMs without sacrificing their…☆21Jan 11, 2026Updated 2 months ago
- VHTest☆16Oct 31, 2024Updated last year
- Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization☆100Jan 30, 2024Updated 2 years ago
- ☆90Jul 4, 2024Updated last year
- [CVPR 2024] The official implementation of paper "synthesize, diagnose, and optimize: towards fine-grained vision-language understanding"☆53Jun 16, 2025Updated 9 months ago
- 【NeurIPS 2024】The official code of paper "Automated Multi-level Preference for MLLMs"☆22Sep 26, 2024Updated last year
- [NAACL 2025] Representing Rule-based Chatbots with Transformers☆23Feb 9, 2025Updated last year
- [NeurIPS 2023 Datasets and Benchmarks Track] LAMM: Multi-Modal Large Language Models and Applications as AI Agents☆318Apr 16, 2024Updated last year
- The "GPT-API-Accelerate" project provides a set of Python classes for accelerating the process of generating responses to prompts using t…☆23Oct 12, 2024Updated last year
- [MM2024, oral] "Self-Supervised Visual Preference Alignment" https://arxiv.org/abs/2404.10501☆62Jul 26, 2024Updated last year
- [ICLR2025] MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models☆95Sep 14, 2024Updated last year
- code for "Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization"☆62Aug 23, 2024Updated last year
- [ICLR 2025] Mathematical Visual Instruction Tuning for Multi-modal Large Language Models☆153Dec 5, 2024Updated last year
- CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts☆163Jun 8, 2024Updated last year
- Mitigating Open-Vocabulary Caption Hallucinations (EMNLP 2024)☆18Oct 18, 2024Updated last year
- codes for ICML2021 paper iDARTS: Differentiable Architecture Search with Stochastic Implicit Gradients☆10May 27, 2021Updated 4 years ago
- [CVPR 2024] LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge☆153Sep 3, 2025Updated 6 months ago
- [NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models☆87Oct 26, 2025Updated 4 months ago
- [EMNLP'23 Oral] ReSee: Responding through Seeing Fine-grained Visual Knowledge in Open-domain Dialogue PyTorch Implementation☆13Dec 4, 2023Updated 2 years ago
- Lightweight PDF Q&A tool powered by RAG (Retrieval-Augmented Generation) with MCP (Model Context Protocol) Support.☆22Oct 27, 2025Updated 4 months ago
- The official GitHub page for ''What Makes for Good Visual Instructions? Synthesizing Complex Visual Reasoning Instructions for Visual Ins…☆19Nov 10, 2023Updated 2 years ago
- [EMNLP'23] The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models"☆110Aug 21, 2025Updated 6 months ago
- [NLPCC'23] ZeroGen: Zero-shot Multimodal Controllable Text Generation with Multiple Oracles PyTorch Implementation☆14Oct 7, 2023Updated 2 years ago
- Official code for "Evaluations of Machine Learning Privacy Defenses are Misleading" (https://arxiv.org/abs/2404.17399)☆12Apr 29, 2024Updated last year
- Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models☆45Jun 14, 2024Updated last year
- ☆66Feb 5, 2024Updated 2 years ago
- [CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts☆336Jul 17, 2024Updated last year
- ☆23May 20, 2025Updated 10 months ago
- a family of highly capable yet efficient large multimodal models☆193Aug 23, 2024Updated last year
- Official repo for StableLLAVA☆95Dec 22, 2023Updated 2 years ago