The WordScape repository contains the code for the WordScape pipeline, which creates datasets for training document understanding models.
☆39 · Dec 7, 2023 · Updated 2 years ago
Alternatives and similar repositories for WordScape
Users who are interested in WordScape are comparing it to the libraries listed below.
- ☆69 · Jan 9, 2024 · Updated 2 years ago
- Index of URLs to pdf files all over the internet and scripts ☆25 · May 2, 2023 · Updated 2 years ago
- Binarizing Documents by Leveraging both Space and Frequency. (ICDAR 2024) ☆15 · May 15, 2025 · Updated 10 months ago
- InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions (AAAI2024) ☆162 · May 31, 2024 · Updated last year
- Dataset and scripts for HRDoc ☆41 · Jun 21, 2023 · Updated 2 years ago
- Create TensorRT-runtime for vietocr ☆12 · Jun 8, 2021 · Updated 4 years ago
- ☆37 · Jan 26, 2026 · Updated last month
- "Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs" 2023 ☆16 · Nov 28, 2024 · Updated last year
- ☆40 · Aug 18, 2021 · Updated 4 years ago
- HTML in Python ☆12 · Jul 19, 2024 · Updated last year
- JSON Schema format for storing datasets details, documents processed contents, and documents annotations in the document understanding do… ☆13 · Nov 5, 2024 · Updated last year
- Read Ten Lines at One Glance: Line-Aware Semi-Autoregressive Transformer for Multi-Line Handwritten Mathematical Expression Recognition ☆28 · Aug 29, 2023 · Updated 2 years ago
- Reproduction paper --- PDFTriage : Question Answering over Long, Structured Documents ☆43 · Jan 16, 2024 · Updated 2 years ago
- VisualMRC: Machine Reading Comprehension on Document Images (AAAI2021) ☆57 · Mar 31, 2025 · Updated 11 months ago
- ☆27 · Feb 20, 2024 · Updated 2 years ago
- ☆82 · Apr 12, 2022 · Updated 3 years ago
- Repository for the KVP10k dataset ☆22 · Sep 18, 2025 · Updated 6 months ago
- v1: Learning to Point Visual Tokens for Multimodal Grounded Reasoning ☆19 · Oct 6, 2025 · Updated 5 months ago
- The most comprehensive Chinese Telegraph Code table ☆12 · Jul 5, 2015 · Updated 10 years ago
- PyTorch implementation of BMVC2022 paper Masked Vision-Language Transformers for Scene Text Recognition ☆29 · Nov 11, 2022 · Updated 3 years ago
- It's the code for the paper Pushing the Performance Limit of Scene Text Recognizer without Human Annotation, CVPR 2022. ☆28 · Jul 6, 2022 · Updated 3 years ago
- ☆42 · Sep 2, 2023 · Updated 2 years ago
- This is the official repository of the revised datasets FUNSD-r and CORD-r, introduced in EMNLP 2023 paper Reading Order Matters: Informa… ☆17 · Mar 20, 2024 · Updated 2 years ago
- Repo for the paper: Towards Few-shot Entity Recognition in Document Images: A Label-aware Sequence-to-Sequence Framework ☆14 · May 31, 2023 · Updated 2 years ago
- A python implementation of PROCLUS: PROjected CLUStering algorithm. ☆10 · Jan 12, 2015 · Updated 11 years ago
- ☆10 · Aug 5, 2019 · Updated 6 years ago
- ACM Multimedia 2023: DocDiff: Document Enhancement via Residual Diffusion Models. Also contains 1597 red seals in Chinese scenes, along w… ☆338 · Aug 22, 2024 · Updated last year
- Code for paper 'Data-Efficient FineTuning' ☆28 · May 24, 2023 · Updated 2 years ago
- An unofficial Pytorch implementation of ERNIE-Layout which is originally released through PaddleNLP. ☆107 · Nov 15, 2023 · Updated 2 years ago
- Personal résumé ☆13 · May 30, 2021 · Updated 4 years ago
- ☆11 · Jul 31, 2022 · Updated 3 years ago
- Official implementation of the ANLS* metric ☆22 · Mar 11, 2026 · Updated last week
- OCR-VQGAN, a discrete image encoder (tokenizer and detokenizer) for figure images in Paper2Fig100k dataset. Implementation of OCR Percept… ☆83 · Jan 30, 2023 · Updated 3 years ago
- ☆14 · Jan 21, 2019 · Updated 7 years ago
- The official code for “DeepEraser: Deep Iterative Context Mining for Generic Text Eraser”, TMM, 2024. ☆48 · Aug 26, 2024 · Updated last year
- A Collection of Pydantic Models to Abstract IRL ☆38 · Dec 10, 2025 · Updated 3 months ago
- Online visual analytics tool designed to investigate how attention maps in transformer models behave, and build hypotheses on those mode… ☆10 · Nov 10, 2021 · Updated 4 years ago
- [CVPR 2026 (Findings) 🔥🔥] Self Evolving Large Multimodal Models with Continuous Rewards ☆21 · Mar 5, 2026 · Updated 2 weeks ago
- DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Models ☆152 · Jan 13, 2025 · Updated last year