Official Repository of MMLONGBENCH-DOC: Benchmarking Long-context Document Understanding with Visualizations
☆124Sep 28, 2025Updated 5 months ago
Alternatives and similar repositories for MMLongBench-Doc
Users who are interested in MMLongBench-Doc are comparing it to the libraries listed below
- Official repository of MMDU dataset☆104Sep 29, 2024Updated last year
- ☆39Jul 28, 2025Updated 7 months ago
- Official Repo for the paper: VCR: Visual Caption Restoration. Check arxiv.org/pdf/2406.06462 for details.☆32Feb 26, 2025Updated last year
- SlideVQA: A Dataset for Document Visual Question Answering on Multiple Images (AAAI2023)☆105Mar 31, 2025Updated 11 months ago
- This repo contains evaluation code for the paper "AV-Odyssey: Can Your Multimodal LLMs Really Understand Audio-Visual Information?"☆31Dec 23, 2024Updated last year
- MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering. A comprehensive evaluation of multimodal large model multilingua…☆63May 15, 2025Updated 9 months ago
- TAT-DQA: Towards Complex Document Understanding By Discrete Reasoning☆23Sep 17, 2024Updated last year
- Extending context length of visual language models☆12Dec 18, 2024Updated last year
- ☆19Sep 16, 2025Updated 5 months ago
- ☆19Sep 11, 2024Updated last year
- The proposed simulated dataset consisting of 9,536 charts and associated data annotations in CSV format.☆26Feb 22, 2024Updated 2 years ago
- [NeurIPS '24 D&B] Official Dataloader and Evaluation Scripts for LongVideoBench.☆113Jul 27, 2024Updated last year
- ☆69Jan 9, 2024Updated 2 years ago
- ☆39Aug 4, 2025Updated 7 months ago
- DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Models☆152Jan 13, 2025Updated last year
- [ICCV 2025] The official code of the paper "Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration R…☆111Jul 9, 2025Updated 7 months ago
- ☆16Jul 23, 2024Updated last year
- Data and code for the EMNLP 2023 System Demo paper "QACHECK: A Demonstration System for Question-Guided Multi-Hop Fact-Checking"☆19Dec 19, 2023Updated 2 years ago
- [ICLR2025] Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want☆94Dec 1, 2025Updated 3 months ago
- ☆19Jan 10, 2025Updated last year
- DocBench: A Benchmark for Evaluating LLM-based Document Reading Systems☆68Sep 29, 2024Updated last year
- UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition☆458Sep 28, 2025Updated 5 months ago
- [ICLR2025] MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models☆94Sep 14, 2024Updated last year
- ☆57Jan 23, 2024Updated 2 years ago
- [ICCV 2023] Code base for Revisiting Scene Text Recognition: A Data Perspective☆201Nov 1, 2023Updated 2 years ago
- On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)☆796Jul 5, 2025Updated 8 months ago
- Generate training datasets for text detection☆12Jul 1, 2020Updated 5 years ago
- [NeurIPS 2024] This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models"☆203Sep 26, 2024Updated last year
- [NAACL 2024] MMC: Advancing Multimodal Chart Understanding with LLM Instruction Tuning☆95Jan 7, 2025Updated last year
- Chinese Vision-Language Understanding Evaluation☆23Dec 26, 2024Updated last year
- The official implementation of RAR☆92Dec 9, 2025Updated 2 months ago
- ☆12Jun 20, 2023Updated 2 years ago
- ☆11Jan 3, 2024Updated 2 years ago
- ACL'2023: Few-shot Event Detection: An Empirical Study and a Unified View☆11Mar 13, 2024Updated last year
- ☆14Jan 24, 2025Updated last year
- Triton version of GQA flash attention, based on the tutorial☆12Aug 4, 2024Updated last year
- An offline evaluation framework for sequence-based recommender systems☆13May 17, 2019Updated 6 years ago
- ☆11Jan 19, 2025Updated last year
- ☆11Nov 5, 2024Updated last year