nttmdlab-nlp/SlideVQA

SlideVQA: A Dataset for Document Visual Question Answering on Multiple Images (AAAI2023)

☆106

Alternatives and similar repositories for SlideVQA

Users who are interested in SlideVQA are comparing it to the repositories listed below.

mayubo2333 / MMLongBench-Doc
Official Repository of MMLONGBENCH-DOC: Benchmarking Long-context Document Understanding with Visualizations
☆149 · Sep 28, 2025 · Updated 9 months ago
rubenpt91 / MP-DocVQA-Framework
☆72 · Jan 9, 2024 · Updated 2 years ago
uakarsh / latr
Implementation of LaTr: Layout-aware transformer for scene-text VQA, a novel multimodal architecture for Scene Text Visual Question Answer…
☆56 · Updated this week
nttmdlab-nlp / InstructDoc
InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions (AAAI2024)
☆162 · May 31, 2024 · Updated 2 years ago
uakarsh / TiLT-Implementation
Implementation of the paper: Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer.
☆18 · Apr 23, 2023 · Updated 3 years ago
HAWLYQ / Qc-TextCap
☆16 · Dec 25, 2021 · Updated 4 years ago
NiteshMethani / PlotQA
Dataset introduced in PlotQA: Reasoning over Scientific Plots
☆83 · Jun 20, 2023 · Updated 3 years ago
manzoku23 / PersonaGeneration
Create Persona dataset from reddit en movie category comment
☆11 · Aug 6, 2021 · Updated 4 years ago
WenjinW / LATIN-Prompt
☆52 · May 28, 2024 · Updated 2 years ago
OpenGVLab / MM-NIAH
[NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of…
☆126 · Nov 25, 2024 · Updated last year
Wangpeiyi9979 / HCL-Text2AMR
Code for ACL22 short Paper "Hierarchical Curriculum Learning for AMR Parsing"
☆13 · Jun 1, 2022 · Updated 4 years ago
vis-nlp / ChartQA
☆260 · Apr 18, 2025 · Updated last year
vis-nlp / OpenCQA
☆13 · Jun 20, 2023 · Updated 3 years ago
Yuliang-Liu / MultimodalOCR
On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)
☆873 · Updated this week
microsoft / UDOP
☆250 · Jan 22, 2023 · Updated 3 years ago
xinke-wang / Awesome-Text-VQA
☆188 · May 8, 2024 · Updated 2 years ago
CCIIPLab / DPT
The code of IJCAI2022 paper, Declaration-based Prompt Tuning for Visual Question Answering
☆20 · May 10, 2022 · Updated 4 years ago
svjack / docvqa-gen
Question Answering dataset generator of Document Visual in English and Chinese
☆24 · Apr 17, 2023 · Updated 3 years ago
anisha2102 / docvqa
Document Visual Question Answering
☆130 · Jul 30, 2020 · Updated 5 years ago
AILab-UniFI / cte-dataset
CTE: Contextualized Table Extraction Dataset
☆17 · Feb 23, 2023 · Updated 3 years ago
jfkuang / CFAM
Contrast-guided Feature Adjustment Module for Visual Information Extraction
☆30 · May 23, 2023 · Updated 3 years ago
atharsefid / Extractive_Research_Slide_Generation_Using_Windowed_Labeling_Ranking
☆18 · Jun 7, 2021 · Updated 5 years ago
applicaai / CCpdf
Index of URLs to pdf files all over the internet and scripts
☆25 · May 2, 2023 · Updated 3 years ago
nttmdlab-nlp / VisualMRC
VisualMRC: Machine Reading Comprehension on Document Images (AAAI2021)
☆57 · Mar 31, 2025 · Updated last year
sbintuitions / flexeval
Flexible evaluation tool for language models
☆61 · Updated this week
SCUT-DLVCLab / Document-AI-Recommendations
Algorithms, papers, datasets, performance comparisons for Document AI.
☆209 · Mar 1, 2025 · Updated last year
VinhLoiIT / vietnamese-htr
Vietnamese handwritten text recognition system
☆18 · May 2, 2021 · Updated 5 years ago
ZephyrZhuQi / ssbaseline
Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps [AAAI2021]
☆57 · Apr 5, 2022 · Updated 4 years ago
herobd / dessurt
Official implementation for Dessurt: Document end-to-end self-supervised understanding and recognition transformer
☆62 · Jan 11, 2023 · Updated 3 years ago
doc-analysis / ReadingBank
ReadingBank: A Benchmark Dataset for Reading Order Detection
☆117 · Aug 26, 2024 · Updated last year
bytedance / MTVQA
MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering. A comprehensive evaluation of multimodal large model multilingua…
☆64 · May 15, 2025 · Updated last year
rossumai / docile
DocILE: Document Information Localization and Extraction Benchmark
☆149 · Jun 17, 2026 · Updated last month
kyegomez / Kosmos2.5
My implementation of Kosmos2.5 from the paper: "KOSMOS-2.5: A Multimodal Literate Model"
☆75 · Updated this week
guoyang9 / UnifER
Official implementation for the MM'22 paper.
☆14 · Jun 30, 2022 · Updated 4 years ago
AILab-CVC / SEED-Bench
(CVPR2024) A benchmark for evaluating Multimodal LLMs using multiple-choice questions.
☆366 · Jan 14, 2025 · Updated last year
krismuniz / google-kgsearch
A simple wrapper for Google's Knowledge Graph Search API.
☆14 · Apr 19, 2017 · Updated 9 years ago
dondongwon / LPMDataset
☆54 · Oct 17, 2023 · Updated 2 years ago
OSU-slatelab / MapQA
☆15 · Jan 9, 2026 · Updated 6 months ago
ChenyuGAO-CS / SMA
The imdb files with SBD-Trans OCR for TextVQA dataset.
☆11 · Nov 30, 2021 · Updated 4 years ago