jfma-USTC / HRDoc
Dataset and scripts for HRDoc
☆35 · Updated last year
Alternatives and similar repositories for HRDoc:
Users who are interested in HRDoc are comparing it to the repositories listed below
- CTE: Contextualized Table Extraction Dataset ☆17 · Updated last year
- Datasets and Evaluation Scripts for CompHRDoc ☆32 · Updated 10 months ago
- ☆111 · Updated 11 months ago
- The WordScape repository contains code for the WordScape pipeline to create datasets to train document understanding models. ☆33 · Updated last year
- Example codebase for fine-tuning LayoutLMv3 on DocVQA ☆49 · Updated 2 years ago
- ☆31 · Updated 9 months ago
- High-Performance Transformers for Table Structure Recognition Need Early Convolutions ☆42 · Updated 9 months ago
- My implementation of Kosmos2.5 from the paper: "KOSMOS-2.5: A Multimodal Literate Model" ☆71 · Updated this week
- Evaluation of the Optical Character Recognition (OCR) capabilities of GPT-4V(ision) ☆121 · Updated last year
- ☆128 · Updated 11 months ago
- Unofficial code for augment-XY-CUT in XYLayoutLM ☆28 · Updated 2 years ago
- ☆74 · Updated last month
- Dataset of PNG images from synthetically generated table layouts with annotations in JSONL files ☆132 · Updated last year
- ☆58 · Updated last year
- ☆50 · Updated 8 months ago
- ☆79 · Updated 2 years ago
- ICDAR 2024 Table OCR Model ☆28 · Updated last month
- An unofficial PyTorch implementation of ERNIE-Layout, which was originally released through PaddleNLP. ☆102 · Updated last year
- MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering. A comprehensive evaluation of multimodal large model multilingua… ☆51 · Updated last month
- Official implementation of the paper "Paragraph2Graph: A Language-independent GNN-based framework for layout analysis" ☆76 · Updated last year
- Vary-tiny codebase built on LAVIS (for training from scratch) and a dataset of PDF image-text pairs (about 600k, including English/Chinese) ☆76 · Updated 4 months ago
- Repository for the KVP10k dataset ☆15 · Updated 4 months ago
- ☆16 · Updated last year
- Visually-Situated Natural Language Understanding with Contrastive Reading Model and Frozen Large Language Models, EMNLP 2023 ☆44 · Updated 7 months ago
- Table Structure Recognition ☆65 · Updated last year
- Code for: U. Khan, S. Zahid, M.A. Ali, A. Ul-Hasan and F. Shafait, TabAug: Data Driven Augmentation for Enhanced Table Structure Recognit… ☆7 · Updated 3 years ago
- [Paper] Code for the EMNLP 2023 (Findings) paper "Global Structure Knowledge-Guided Relation Extraction Method for Visually-Rich Document" ☆16 · Updated last year
- MPB (Miner-PDF-Benchmark) is an end-to-end PDF document comprehension evaluation suite designed for large-scale model data scenarios. ☆20 · Updated last month
- Contrast-guided Feature Adjustment Module for Visual Information Extraction ☆28 · Updated last year