Lillianwei-h / MMIE
[ICLR'25 Oral] MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models
☆35 · Nov 3, 2024 · Updated last year
Alternatives and similar repositories for MMIE
Users who are interested in MMIE are comparing it to the repositories listed below.
- [TACL] Do Vision and Language Models Share Concepts? A Vector Space Alignment Study ☆16 · Nov 22, 2024 · Updated last year
- [ECCV 2024] "REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language Models" ☆13 · Aug 6, 2024 · Updated last year
- LMM for VQA, tcsvt version ☆11 · Jul 19, 2024 · Updated last year
- ☆12 · Dec 4, 2024 · Updated last year
- [CVPR 2024] KEPP: Why Not Use Your Textbook? Knowledge-Enhanced Procedure Planning of Instructional Videos ☆12 · Sep 24, 2024 · Updated last year
- [ACL 2025 🔥] Time Travel is a Comprehensive Benchmark to Evaluate LMMs on Historical and Cultural Artifacts ☆18 · May 22, 2025 · Updated 8 months ago
- From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning. ☆24 · Oct 7, 2025 · Updated 4 months ago
- A Python library for processing and filtering TabLib ☆13 · Aug 24, 2024 · Updated last year
- ☆21 · Jul 25, 2025 · Updated 6 months ago
- [EMNLP'24] RULE: Reliable Multimodal RAG for Factuality in Medical Vision Language Models ☆96 · Dec 13, 2024 · Updated last year
- ☆16 · Jul 10, 2022 · Updated 3 years ago
- ☆18 · Oct 28, 2025 · Updated 3 months ago
- [ICLR'26] EduVisAgent: A Benchmark and Multi-Agent Framework for Pedagogical Visualization ☆28 · Aug 5, 2025 · Updated 6 months ago
- ☆20 · Apr 16, 2025 · Updated 10 months ago
- INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model ☆42 · Aug 4, 2024 · Updated last year
- ☆46 · Nov 8, 2024 · Updated last year
- SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis ☆68 · Jul 24, 2025 · Updated 6 months ago
- [EMNLP 2024] Tree of Problems: Improving structured problem solving with compositionality ☆19 · Mar 4, 2025 · Updated 11 months ago
- ☆16 · Jul 23, 2024 · Updated last year
- [NeurIPS'24] CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models ☆77 · Dec 4, 2024 · Updated last year
- ☆15 · Feb 21, 2024 · Updated last year
- ☆27 · Feb 7, 2025 · Updated last year
- Official implementation of Adaptive Feature Transfer (AFT) ☆23 · Jun 12, 2024 · Updated last year
- [ICML 2025] From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories and Applications ☆52 · Oct 30, 2025 · Updated 3 months ago
- A Framework for Decoupling and Assessing the Capabilities of VLMs ☆43 · Jun 28, 2024 · Updated last year
- Test-time Scaling for VAR models ☆31 · Sep 19, 2025 · Updated 4 months ago
- [ACM MM 2025] MLLMs for Aesthetics Reasoning ☆23 · Jan 5, 2026 · Updated last month
- A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models ☆28 · Nov 25, 2024 · Updated last year
- Awesome autoregressive vision foundation models ☆25 · Dec 24, 2024 · Updated last year
- An Open-source Factuality Evaluation Demo for LLMs ☆32 · Aug 10, 2025 · Updated 6 months ago
- [Technical Report] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with … ☆63 · Oct 9, 2024 · Updated last year
- [ICML'25] MMedPO: Aligning Medical Vision-Language Models with Clinical-Aware Multimodal Preference Optimization ☆67 · Jun 5, 2025 · Updated 8 months ago
- [ICLR'25] MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models ☆301 · Jan 22, 2025 · Updated last year
- Official code repository of paper titled "Test-Time Low Rank Adaptation via Confidence Maximization for Zero-Shot Generalization of Visio… ☆31 · May 11, 2025 · Updated 9 months ago
- Pytorch implementation of HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models ☆28 · Mar 22, 2024 · Updated last year
- A Text2SQL benchmark for evaluation of Large Language Models ☆41 · Feb 8, 2026 · Updated last week
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w… ☆13 · Jun 28, 2025 · Updated 7 months ago
- [NeurIPS 2024] MoVA: Adapting Mixture of Vision Experts to Multimodal Context ☆173 · Sep 25, 2024 · Updated last year
- [EMNLP 2025] Distill Visual Chart Reasoning Ability from LLMs to MLLMs ☆59 · Aug 25, 2025 · Updated 5 months ago