Dataset and scripts for HRDoc
☆41 · Jun 21, 2023 · Updated 2 years ago
Alternatives and similar repositories for HRDoc
Users who are interested in HRDoc are comparing it to the repositories listed below
- ☆32 · Apr 14, 2024 · Updated last year
- ☆18 · Jun 7, 2023 · Updated 2 years ago
- Code for the ICDAR2021 paper "Visual FUDGE: Form Understanding via Dynamic Graph Editing" ☆33 · Mar 4, 2022 · Updated 4 years ago
- This is the official repository of the revised datasets FUNSD-r and CORD-r, introduced in EMNLP 2023 paper Reading Order Matters: Informa… ☆17 · Mar 20, 2024 · Updated last year
- Generator of Document Visual Question Answering datasets in English and Chinese ☆24 · Apr 17, 2023 · Updated 2 years ago
- Code for the paper "Abstractive Summarization Guided by Latent Hierarchical Document Structure" ☆13 · May 20, 2023 · Updated 2 years ago
- Tool to parse wiki tables from the HTML dump of Wikipedia ☆11 · Jun 12, 2022 · Updated 3 years ago
- ☆156 · May 8, 2025 · Updated 9 months ago
- ☆82 · Apr 12, 2022 · Updated 3 years ago
- Release for CHART annotation tools used for ICDAR CHART 2019 competition ☆28 · Sep 15, 2023 · Updated 2 years ago
- ☆51 · May 28, 2024 · Updated last year
- Data and code for the paper "CiteWorth: Cite-Worthiness Detection for Improved Scientific Document Understanding" ☆14 · Sep 8, 2022 · Updated 3 years ago
- The WordScape repository contains code for the WordScape pipeline to create datasets to train document understanding models. ☆39 · Dec 7, 2023 · Updated 2 years ago
- ☆40 · Aug 18, 2021 · Updated 4 years ago
- ☆17 · Jan 23, 2021 · Updated 5 years ago
- [Paper] Code for the EMNLP2023 (Findings) paper "Global Structure Knowledge-Guided Relation Extraction Method for Visually-Rich Document" ☆17 · Dec 1, 2023 · Updated 2 years ago
- Context-Aware Chart Element Detection ☆50 · Sep 25, 2025 · Updated 5 months ago
- Visually-Situated Natural Language Understanding with Contrastive Reading Model and Frozen Large Language Models, EMNLP 2023 ☆46 · Jun 11, 2024 · Updated last year
- Official Implementation of Web-based Visual Corpus Builder (Webvicob), ICDAR 2023 ☆109 · Oct 24, 2023 · Updated 2 years ago
- TAT-DQA: Towards Complex Document Understanding By Discrete Reasoning ☆23 · Sep 17, 2024 · Updated last year
- ☆22 · May 5, 2021 · Updated 4 years ago
- ☆57 · Jan 23, 2024 · Updated 2 years ago
- This is an official implementation for the WTW Dataset in "Parsing Table Structures in the Wild" on table detection and table structure … ☆182 · Sep 15, 2021 · Updated 4 years ago
- ☆69 · Jan 9, 2024 · Updated 2 years ago
- ☆72 · Mar 10, 2025 · Updated 11 months ago
- Released code for our ICLR23 paper. ☆66 · Mar 23, 2023 · Updated 2 years ago
- The Learnable Typewriter: A Generative Approach to Text Line Analysis ☆34 · Oct 31, 2024 · Updated last year
- Tools for content datamining and NLP at scale ☆44 · Jun 20, 2024 · Updated last year
- [NAACL 2024] Making Language Models Better Tool Learners with Execution Feedback ☆43 · Mar 14, 2024 · Updated last year
- NNVisBuilder and some cases including KD-t ☆12 · Nov 18, 2023 · Updated 2 years ago
- A large scale camera-taken table detection and recognition dataset. ☆149 · Jul 21, 2025 · Updated 7 months ago
- WikiTableSet: The largest publicly available image-based table recognition dataset in three languages, built from Wikipedia ☆32 · Jun 12, 2025 · Updated 8 months ago
- InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions (AAAI2024) ☆162 · May 31, 2024 · Updated last year
- ☆10 · Oct 2, 2024 · Updated last year
- Project that gathers state-of-the-art knowledge distillation approaches for unsupervised anomaly detection ☆13 · Oct 10, 2025 · Updated 4 months ago
- Directed masked autoencoders ☆14 · Feb 20, 2026 · Updated last week
- PyTorch Implementation of the paper "Defining and Quantifying the Emergence of Sparse Concepts in DNNs" (CVPR 2023) ☆12 · Dec 24, 2023 · Updated 2 years ago
- ☆161 · Dec 27, 2022 · Updated 3 years ago
- OCR toolbox from Davar-Lab ☆759 · Nov 16, 2023 · Updated 2 years ago