Dataset and Code for our ACL 2024 paper: "Multimodal Table Understanding". We propose the first large-scale Multimodal IFT and Pre-Train Dataset for table understanding and develop a generalist tabular MLLM named Table-LLaVA.
☆225 Jun 12, 2025 Updated 8 months ago
Alternatives and similar repositories for Table-LLaVA
Users that are interested in Table-LLaVA are comparing it to the libraries listed below
- WikiTableSet: The largest publicly available image-based table recognition dataset in three languages, built from Wikipedia ☆32 Jun 12, 2025 Updated 8 months ago
- We collect papers about "large language models (LLMs) for table-related tasks", e.g., using LLMs for the Table QA task. A curated collection of "Table + LLM" papers ☆610 Dec 15, 2025 Updated 2 months ago
- A Curated List of Awesome Table Structure Recognition (TSR) Research. Including models, papers, datasets and codes. Continuously updating… ☆225 Sep 9, 2024 Updated last year
- ☆102 Dec 23, 2024 Updated last year
- ICDAR 2024 Table OCR Model ☆39 Updated this week
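- This project aims to collect open-source datasets for table intelligence tasks (such as table question answering and table-to-text generation), convert the raw data into an instruction-tuning format, and fine-tune LLMs on it, thereby strengthening LLMs' understanding of tabular data and ultimately building a large language model dedicated to table intelligence tasks. ☆642 Apr 22, 2024 Updated last year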
- UniTable: Towards a Unified Table Foundation Model ☆525 Jun 4, 2024 Updated last year
- This repository is the codebase of TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy ☆50 Oct 16, 2024 Updated last year
- Dataset of PNG images from synthetically generated table layouts with annotations in JSONL files ☆152 Sep 17, 2025 Updated 5 months ago
- A curated list of recent and past chart understanding work based on our IEEE TKDE survey paper: From Pixels to Insights: A Survey on Auto… ☆232 Dec 17, 2025 Updated 2 months ago
- TAT-DQA: Towards Complex Document Understanding By Discrete Reasoning ☆23 Sep 17, 2024 Updated last year
- A large scale camera-taken table detection and recognition dataset. ☆149 Jul 21, 2025 Updated 7 months ago
- ☆142 Feb 13, 2024 Updated 2 years ago
- [MM'2024] PEneo, an effective algorithm for key-value pair extraction from form-like documents, designed for real-world applications. ☆41 Apr 7, 2025 Updated 10 months ago
- [ICCV2025] A Token-level Text Image Foundation Model for Document Understanding ☆132 Aug 27, 2025 Updated 6 months ago
- Dataset introduced in PlotQA: Reasoning over Scientific Plots ☆84 Jun 20, 2023 Updated 2 years ago
- ☆40 Jun 15, 2024 Updated last year
- official code for "Fox: Focus Anywhere for Fine-grained Multi-page Document Understanding" ☆195 May 31, 2024 Updated last year
- Datasets and Evaluation Scripts for CompHRDoc ☆56 Feb 25, 2025 Updated last year
- Document Artificial Intelligence ☆201 Sep 28, 2025 Updated 5 months ago
- A High-efficiency Open-source Toolkit for Table-to-Latex Task ☆275 Dec 6, 2025 Updated 2 months ago
- SPRINT: Script-agnostic Structure Recognition in Tables ☆16 Mar 26, 2025 Updated 11 months ago
- A Faster LayoutReader Model based on LayoutLMv3, Sort OCR bboxes to reading order. ☆312 Aug 15, 2025 Updated 6 months ago
- A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team… ☆1,820 Apr 9, 2025 Updated 10 months ago
- RoDLA: Benchmarking the Robustness of Document Layout Analysis Models ☆39 Mar 26, 2025 Updated 11 months ago
- Read Ten Lines at One Glance: Line-Aware Semi-Autoregressive Transformer for Multi-Line Handwritten Mathematical Expression Recognition ☆28 Aug 29, 2023 Updated 2 years ago
- ☆48 Feb 7, 2025 Updated last year
- Official implementation of UPOCR: Towards unified pixel-level OCR interface (ICML 2024) ☆67 Jun 6, 2024 Updated last year
- MTL-TabNet: Multi-task Learning based Model for Image-based Table Recognition ☆103 May 30, 2024 Updated last year
- Table Structure Recognition ☆81 Mar 11, 2023 Updated 2 years ago
- mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding ☆2,370 May 30, 2025 Updated 9 months ago
- [ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models. ☆1,897 Dec 30, 2024 Updated last year
- ☆78 Aug 7, 2023 Updated 2 years ago
- [ACL 2022] A hierarchical table dataset for question answering and data-to-text generation. ☆107 Dec 16, 2025 Updated 2 months ago
- [ACM'MM 2024 Oral] Official code for "OneChart: Purify the Chart Structural Extraction via One Auxiliary Token" ☆260 Apr 14, 2025 Updated 10 months ago
- 360LayoutAnalysis, a series of document analysis models and datasets developed by 360 AI Research Institute ☆306 Sep 10, 2024 Updated last year
- ☆31 Apr 8, 2025 Updated 10 months ago
- ☆156 May 8, 2025 Updated 9 months ago
- [NAACL 2024] Visually Guided Generative Text-Layout Pre-training for Document Intelligence ☆149 Sep 10, 2024 Updated last year