SpursGoZmy / Table-LLaVA

Dataset and Code for our ACL 2024 paper: "Multimodal Table Understanding". We propose the first large-scale Multimodal IFT and Pre-Train Dataset for table understanding and develop a generalist tabular MLLM named Table-LLaVA.

☆218

Alternatives and similar repositories for Table-LLaVA

Users who are interested in Table-LLaVA are comparing it to the repositories listed below.

harrytea / Awesome-Document-Understanding
Document Artificial Intelligence
☆189Updated 3 weeks ago
LukeForeverYoung / UReader
☆141Updated last year
LingyvKong / OneChart
[ACM'MM 2024 Oral] Official code for "OneChart: Purify the Chart Structural Extraction via One Auxiliary Token"
☆247Updated 6 months ago
ucaslcl / Fox
Official code for "Fox: Focus Anywhere for Fine-grained Multi-page Document Understanding"
☆171Updated last year
mayubo2333 / MMLongBench-Doc
Official Repository of MMLONGBENCH-DOC: Benchmarking Long-context Document Understanding with Visualizations
☆100Updated 3 weeks ago
yh-hust / PDF-Wukong
【ArXiv】PDF-Wukong: A Large Multimodal Model for Efficient Long PDF Reading with End-to-End Sparse Sampling
☆126Updated 4 months ago
tingxueronghua / ChartLlama-code
☆249Updated last year
khuangaf / Awesome-Chart-Understanding
A curated list of recent and past chart understanding work based on our IEEE TKDE survey paper: From Pixels to Insights: A Survey on Auto…
☆224Updated 4 months ago
Ucas-HaoranWei / Vary-tiny-600k
Vary-tiny codebase built upon LAVIS (for training from scratch) and a PDF image-text pair dataset (about 600k pairs, including English and Chinese)
☆86Updated last year
lfy79001 / TableQAKit
A Toolkit for Table-based Question Answering
☆114Updated 2 years ago
Alibaba-NLP / OmniSearch
Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent
☆384Updated 6 months ago
sakura2233565548 / TabPedia
This repository is the codebase of TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy
☆46Updated last year
LinWeizheDragon / Retrieval-Augmented-Visual-Question-Answering
This is the official repository for Retrieval Augmented Visual Question Answering
☆238Updated 10 months ago
OpenGVLab / ChartAst
[ACL 2024] ChartAssistant is a chart-based vision-language model for universal chart comprehension and reasoning.
☆130Updated last year
PanguIR / MRAGSurvey
A Survey of Multimodal Retrieval-Augmented Generation
☆19Updated 6 months ago
RUCKBReasoning / TableLLM
TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios
☆234Updated 2 months ago
Yuliang-Liu / MultimodalOCR
On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)
☆733Updated 3 months ago
LayTextLLM / LayTextLLM
☆98Updated 10 months ago
zhangfaen / finetune-Qwen2-VL
☆375Updated 8 months ago
Alibaba-NLP / VRAG
Repo for "VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforce…
☆377Updated last week
LinWeizheDragon / FLMR
The huggingface implementation of Fine-grained Late-interaction Multi-modal Retriever.
☆100Updated 4 months ago
Alpha-Innovator / StructEqTable-Deploy
A High-efficiency Open-source Toolkit for Table-to-Latex Task
☆264Updated 10 months ago
OpenBMB / RAGEval
☆200Updated 6 months ago
liunian-Jay / MU-GOT
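PDF parsing tool: a vLLM-accelerated implementation of GOT, with MinerU handling layout detection and cropping and GOT parsing tables and formulas, for PDF parsing in RAG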
☆64Updated 11 months ago
FuxiaoLiu / MMC
[NAACL 2024] MMC: Advancing Multimodal Chart Understanding with LLM Instruction Tuning
☆96Updated 9 months ago
HCIILAB / M6Doc
☆156Updated 5 months ago
vis-nlp / ChartQA
☆221Updated 6 months ago
zjysteven / lmms-finetune
A minimal codebase for finetuning large multimodal models, supporting llava-1.5/1.6, llava-interleave, llava-next-video, llava-onevision,…
☆345Updated 8 months ago
infly-ai / INF-MLLM
☆95Updated last month
pengr / LLM-Synthetic-Data
A live reading list for LLM data synthesis (Updated to July, 2025).
☆387Updated 2 months ago