NiteshMethani / PlotQA

Dataset introduced in PlotQA: Reasoning over Scientific Plots

☆82

Alternatives and similar repositories for PlotQA

Users who are interested in PlotQA are comparing it to the repositories listed below

vis-nlp / UniChart
☆82 Updated last year
nttmdlab-nlp / SlideVQA
SlideVQA: A Dataset for Document Visual Question Answering on Multiple Images (AAAI2023)
☆103 Updated 8 months ago
rubenpt91 / MP-DocVQA-Framework
☆67 Updated last year
vis-nlp / Chart-to-text
☆120 Updated last year
FuxiaoLiu / MMC
[NAACL 2024] MMC: Advancing Multimodal Chart Understanding with LLM Instruction Tuning
☆96 Updated 11 months ago
vis-nlp / ChartQA
☆230 Updated 7 months ago
tingyaohsu / SciCap
SciCap Dataset
☆56 Updated 4 years ago
vis-nlp / OpenCQA
☆11 Updated 2 years ago
naver-ai / tablevqabench
☆45 Updated last year
InternScience / SimChart9K
The proposed simulated dataset consisting of 9,536 charts and associated data annotations in CSV format.
☆26 Updated last year
mayubo2333 / MMLongBench-Doc
Official Repository of MMLONGBENCH-DOC: Benchmarking Long-context Document Understanding with Visualizations
☆108 Updated 2 months ago
princeton-nlp / CharXiv
[NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs
☆134 Updated 7 months ago
huggingface / docmatix
A huge dataset for Document Visual Question Answering
☆20 Updated last year
naver-ai / cream
Visually-Situated Natural Language Understanding with Contrastive Reading Model and Frozen Large Language Models, EMNLP 2023
☆46 Updated last year
bytedance / MTVQA
MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering. A comprehensive evaluation of multimodal large model multilingua…
☆63 Updated 6 months ago
huggingface / OBELICS
Code used for the creation of OBELICS, an open, massive and curated collection of interleaved image-text web documents, containing 141M d…
☆210 Updated last year
TIGER-AI-Lab / UniIR
Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024)
☆169 Updated last year
OpenGVLab / ChartAst
[ACL 2024] ChartAssistant is a chart-based vision-language model for universal chart comprehension and reasoning.
☆131 Updated last year
PLUM-Lab / MultiInstruct
MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning
☆134 Updated 2 years ago
HZQ950419 / Math-LLaVA
Code for Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models
☆92 Updated last year
tianyu-z / VCR
Official Repo for the paper: VCR: Visual Caption Restoration. Check arxiv.org/pdf/2406.06462 for details.
☆31 Updated 9 months ago
thunlp / Muffin
☆66 Updated last year
gregor-ge / mBLIP
☆87 Updated last year
open-vision-language / oven
☆41 Updated 2 years ago
due-benchmark / baselines
The code related to the baselines from NeurIPS 2021 paper "DUE: End-to-End Document Understanding Benchmark."
☆36 Updated 2 years ago
mlfoundations / VisIT-Bench
☆50 Updated 2 years ago
manoja328 / TallyQA_dataset
TallyQA: Answering Complex Counting Questions dataset
☆28 Updated last year
OFA-Sys / TouchStone
Touchstone: Evaluating Vision-Language Models by Language Models
☆83 Updated last year
princeton-nlp / PTP
Improving Language Understanding from Screenshots. Paper: https://arxiv.org/abs/2402.14073
☆31 Updated last year
FreedomIntelligence / MLLM-Bench
MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria
☆72 Updated last year