bytedance/WildDoc

The official repo for “WildDoc: How Far Are We from Achieving Comprehensive and Robust Document Understanding in the Wild?”

☆74

Alternatives and similar repositories for WildDoc

Users who are interested in WildDoc are comparing it to the libraries listed below.

caipeng328 / ForCenNet
☆81 · Jul 31, 2025 · Updated 11 months ago
FelixHertlein / doc-matcher
Inference, training and evaluation code for our paper "DocMatcher: Document Image Dewarping via Structural and Textual Line Matching" (WA…
☆55 · Jul 1, 2025 · Updated last year
RylonW / DocNLC
Official code for DocNLC: A Document Image Enhancement Framework with Normalized and Latent Contrastive Representation for Multiple Degra…
☆44 · Mar 20, 2026 · Updated 4 months ago
Topdu / DocPTBench
Benchmarking End-to-End Photographed Document Parsing and Translation
☆17 · Dec 4, 2025 · Updated 7 months ago
DrLuo / RTM
The official repository of Real Text Manipulation (RTM)
☆46 · Mar 18, 2025 · Updated last year
FelixHertlein / inv3d
Project page for the ICDAR 2023 Paper "Inv3D: a high-resolution 3D invoice dataset for template-guided single-image document unwarping".
☆13 · Dec 21, 2023 · Updated 2 years ago
ZZZHANG-jx / WMeter-Reader
[TIM 2025] Towards Accurate Readings of Water Meters by Eliminating Transition Error: New Dataset and Effective Solution
☆19 · Mar 5, 2025 · Updated last year
BunnySoCrazy / LA-DocFlatten
Code and Dataset for our paper: Layout-Aware Single-Image Document Flattening
☆24 · Dec 16, 2024 · Updated last year
harrytea / UDoc-GAN
Official PyTorch implementation for ACM MM22 "UDoc-GAN: Unpaired Document Illumination Correction with Background Light Prior"
☆25 · Aug 5, 2024 · Updated last year
chaoyunwang / AADD
AAAI2026 paper code: Axis-Aligned Document Dewarping
☆17 · Mar 9, 2026 · Updated 4 months ago
shannanyinxiang / UPOCR
Official implementation of UPOCR: Towards unified pixel-level OCR interface (ICML 2024)
☆69 · Jun 6, 2024 · Updated 2 years ago
Token-family / TokenFD
[ICCV2025] A Token-level Text Image Foundation Model for Document Understanding
☆135 · Aug 27, 2025 · Updated 10 months ago
TenMilesLotus / DTSM
Code and data for the paper: DTSM: Toward Dense Table Structure Recognition with Text Query Encoder and Adjacent Feature Aggregator
☆13 · Apr 28, 2024 · Updated 2 years ago
qcf-568 / OSTF
[AAAI2025] Revisiting Tampered Scene Text Detection in the Era of Generative AI
☆72 · Jun 7, 2026 · Updated last month
lcy0604 / QT-TextSR
This repository is the implementation of "QT-TextSR: Enhancing scene text image super-resolution via efficient interaction with text reco…
☆20 · Jul 9, 2025 · Updated last year
ZZZHANG-jx / Recommendations-Document-Image-Processing
This repository contains a paper collection of the methods for document image processing, including appearance enhancement, deshadowing, …
☆394 · Jun 1, 2026 · Updated last month
sakura2233565548 / TabPedia
This repository is the codebase of TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy
☆51 · Oct 16, 2024 · Updated last year
KahimWong / ADCD-Net
[ICCV'25] ADCD-Net: Robust Document Image Forgery Localization via Adaptive DCT Feature and Hierarchical Content Disentanglement
☆26 · Mar 29, 2026 · Updated 3 months ago
irisXcoding / DocReal
DocReal: Robust Document Dewarping of Real-Life Images via Attention-Enhanced Control Point Prediction
☆30 · Jun 28, 2023 · Updated 3 years ago
xiaomore / Document-Image-Dewarping
☆69 · Nov 30, 2023 · Updated 2 years ago
SCUT-DLVCLab / OCR-Reasoning
[ICLR 2026] OCR-Reasoning Benchmark: Unveiling the True Capabilities of MLLMs in Complex Text-Rich Image Reasoning
☆76 · May 26, 2026 · Updated last month
qcf-568 / DocTamper
[CVPR2023] Towards Robust Tampered Text Detection in Document Image: New Dataset and New Solution
☆202 · Feb 4, 2026 · Updated 5 months ago
sunzhihao18 / ForgerySleuth
☆30 · May 22, 2025 · Updated last year
ZZZHANG-jx / DocKylin
[AAAI 2025] DocKylin: A Large Multimodal Model for Visual Document Understanding with Efficient Visual Slimming
☆36 · Jun 1, 2025 · Updated last year
fh2019ustc / DeepEraser
The official code for “DeepEraser: Deep Iterative Context Mining for Generic Text Eraser”, TMM, 2024.
☆53 · Aug 26, 2024 · Updated last year
cvlab-stonybrook / DocIIW
Repository for Intrinsic Decomposition of Document Images In-the-Wild (BMVC '20)
☆51 · May 14, 2023 · Updated 3 years ago
bytedance / E2STR
The official code for the CVPR 2024 paper: Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer
☆55 · Jun 14, 2024 · Updated 2 years ago
chenxn2020 / GOSE
[Paper] Code for the EMNLP2023 (Findings) paper "Global Structure Knowledge-Guided Relation Extraction Method for Visually-Rich Document"
☆17 · Dec 1, 2023 · Updated 2 years ago
FelixHertlein / illtrtemplate-model
Code from our paper "Template-guided Illumination Correction for Document Images with Imperfect Geometric Reconstruction" (ICCVW) 2023.
☆29 · Feb 7, 2024 · Updated 2 years ago
ZZZHANG-jx / DocAligner
[PR 2025] DocAligner: Automating the Annotation of Photographed Documents Through Real-virtual Alignment
☆110 · Aug 4, 2025 · Updated 11 months ago
CXH-Research / StainRestorer
[WACV 2025] High-Fidelity Document Stain Removal via A Large-Scale Real-World Dataset and A Memory-Augmented Transformer
☆23 · Jan 14, 2026 · Updated 6 months ago
ZZZHANG-jx / DocRes
[CVPR 2024] DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks
☆628 · Aug 3, 2025 · Updated 11 months ago
Tencent / POINTS-Reader
☆197 · Dec 7, 2025 · Updated 7 months ago
ZZZHANG-jx / GCDRNet
[TAI 2023] Appearance Enhancement for Camera-captured Document Images in the Wild
☆58 · Aug 28, 2025 · Updated 10 months ago
MaxKinny / TabRecSet
A large-scale camera-taken table detection and recognition dataset.
☆151 · Apr 9, 2026 · Updated 3 months ago
bzluan / TextCoT
[ACM TOMM] Official implementation of "TextCoT: Zoom-In for Enhanced Multimodal Text-Rich Image Understanding"
☆45 · Feb 27, 2026 · Updated 4 months ago
LayTextLLM / LayTextLLM
☆103 · Dec 23, 2024 · Updated last year
williamyang1991 / Awesome-Artistic-Typography
☆63 · Jul 23, 2024 · Updated 2 years ago
HCIILAB / M6Doc
☆164 · May 8, 2025 · Updated last year