Official Repository of MMLONGBENCH-DOC: Benchmarking Long-context Document Understanding with Visualizations
☆129Sep 28, 2025Updated 5 months ago
Alternatives and similar repositories for MMLongBench-Doc
Users that are interested in MMLongBench-Doc are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- ☆41Jul 28, 2025Updated 7 months ago
- About Data and Codes for EMNLP 2023 System Demo Paper "QACHECK: A Demonstration System for Question-Guided Multi-Hop Fact-Checking"☆19Dec 19, 2023Updated 2 years ago
- Official repository of MMDU dataset☆104Sep 29, 2024Updated last year
- A summary of must-read papers for Neural Question Generation (NQG)☆14Nov 14, 2020Updated 5 years ago
- ☆40Aug 4, 2025Updated 7 months ago
- ☆32Aug 30, 2024Updated last year
- Crawled Wikipedia Tables with Passages☆13Aug 19, 2021Updated 4 years ago
- [CVPR 2025] DocLayLLM: An Efficient Multi-modal Extension of Large Language Models for Text-rich Document Understanding☆29Dec 18, 2025Updated 3 months ago
- Extending context length of visual language models☆12Dec 18, 2024Updated last year
- The Dataset and Official Implementation for <The ELCo Dataset: Bridging Emoji and Lexical Composition> @ LREC-COLING 2024☆16May 11, 2024Updated last year
- ACL'2023: Few-shot Event Detection: An Empirical Study and a Unified View☆11Mar 13, 2024Updated 2 years ago
- ☆16Jul 23, 2024Updated last year
- [ICLR 2026] An official implementation of "SIM-CoT: Supervised Implicit Chain-of-Thought"☆188Feb 4, 2026Updated last month
- LMM for VQA, tcsvt version☆10Jul 19, 2024Updated last year
- [CVPR 2026] Official Code for "ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning"☆85Feb 13, 2026Updated last month
- [NeurIPS 2025] Official implementation of HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance☆86Sep 18, 2025Updated 6 months ago
- Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper.☆268Mar 17, 2026Updated last week
- The official implementation of RAR☆91Dec 9, 2025Updated 3 months ago
- Official Repo for the paper: VCR: Visual Caption Restoration. Check arxiv.org/pdf/2406.06462 for details.☆32Feb 26, 2025Updated last year
- This is the official repository for Retrieval Augmented Visual Question Answering☆247Dec 19, 2024Updated last year
- ☆31Dec 9, 2023Updated 2 years ago
- TAT-DQA: Towards Complex Document Understanding By Discrete Reasoning☆23Sep 17, 2024Updated last year
- ☆12Jun 20, 2023Updated 2 years ago
- This repository contains source code for the PASTA model, a pre-trained language model for table-based fact verification.☆18Dec 27, 2022Updated 3 years ago
- ☆17Oct 22, 2024Updated last year
- ☆57Jan 23, 2024Updated 2 years ago
- [CVPR 2025] Official implementation of ByTheWay: Boost Your Text-to-Video Generation Model to Higher Quality in a Training-free Way☆48Oct 10, 2025Updated 5 months ago
- [ICLR2025] Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want☆95Dec 1, 2025Updated 3 months ago
- Repository for Teaching Broad Reasoning Skills for Multi-Step QA by Generating Hard Contexts, EMNLP22☆19Jun 23, 2023Updated 2 years ago
- Official implementation of URaG: Unified Retrieval and Generation in Multimodal LLMs for Efficient Long Document Understanding (AAAI 2026…☆38Feb 4, 2026Updated last month
- 📖Curated list about reasoning abilitiy of MLLM, including OpenAI o1, OpenAI o3-mini, and Slow-Thinking.☆13Feb 7, 2025Updated last year
- [NeurIPS 2024] This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models"☆205Sep 26, 2024Updated last year
- The project page for "SCITAB: A Challenging Benchmark for Compositional Reasoning and Claim Verification on Scientific Tables"☆23Dec 21, 2023Updated 2 years ago
- DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Models☆153Jan 13, 2025Updated last year
- This is the repository for NAACL'25 paper "TART: An Open-Source Tool-Augmented Framework for Explainable Table-based Reasoning"☆55May 3, 2025Updated 10 months ago
- [CVPR 2026] An official implementation of "Think Visually, Reason Textually: Vision-Language Synergy in ARC"☆39Nov 26, 2025Updated 3 months ago
- Open-source evaluation toolkit of large multi-modality models (LMMs), support 220+ LMMs, 80+ benchmarks☆3,920Updated this week
- MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering. A comprehensive evaluation of multimodal large model multilingua…☆64May 15, 2025Updated 10 months ago
- Syphus: Automatic Instruction-Response Generation Pipeline☆14Dec 14, 2023Updated 2 years ago