hiroi-sora/GapTree_Sort_Algorithm

hiroi-sora / GapTree_Sort_Algorithm

[Gap-Tree Sorting Algorithm] Performs layout analysis on OCR results or text extracted from PDFs and sorts it into human reading order.

☆167

Alternatives and similar repositories for GapTree_Sort_Algorithm

Users who are interested in GapTree_Sort_Algorithm are comparing it to the libraries listed below.

RapidAI / RapidUnDistort
Corrects document distortion, blur, shadows, and similar defects; simple, lightweight deployment with ONNX models; will keep tracking the latest and best document rectification approaches and models.
☆105 · Dec 17, 2025 · Updated 7 months ago
FreeOCR-AI / layoutreader
A faster LayoutReader model based on LayoutLMv3; sorts OCR bounding boxes into reading order.
☆322 · Aug 15, 2025 · Updated 11 months ago
Line-Kite / GraphLayoutLM
☆14 · Sep 6, 2024 · Updated last year
RapidAI / RapidLayout
Analysis of Chinese and English layouts
☆275 · Mar 24, 2026 · Updated 3 months ago
GreatV / DocTrPP
DocTr++ in PaddlePaddle
☆56 · Jul 24, 2024 · Updated last year
RapidAI / RapidOrientation
Document orientation classification
☆221 · Feb 3, 2026 · Updated 5 months ago
OKC13 / General-Documents-Layout-parser
General layout analysis | Chinese document parsing | Document Layout Analysis | layout parser
☆47 · Jun 13, 2024 · Updated 2 years ago
RapidAI / TableStructureRec
Collects the best currently open-source table recognition models, improves pre- and post-processing, and converts the models to ONNX.
☆954 · Aug 3, 2025 · Updated 11 months ago
SWHL / ChineseDocumentPDF
PDF data from Chinese academic papers, securities documents, and financial reports
☆41 · Jun 13, 2024 · Updated 2 years ago
buptlihang / CDLA
CDLA: A Chinese document layout analysis (CDLA) dataset
☆293 · Sep 13, 2021 · Updated 4 years ago
Tan-Junwen / awesome-table-structure-recognition
A Curated List of Awesome Table Structure Recognition (TSR) Research. Including models, papers, datasets and codes. Continuously updating…
☆232 · Sep 9, 2024 · Updated last year
tanguymagne / UVDoc
Code for the paper "UVDoc: Neural Grid-based Document Unwarping"
☆222 · Jul 28, 2024 · Updated last year
AlibabaResearch / AdvancedLiterateMachinery
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team…
☆1,833 · Mar 17, 2026 · Updated 4 months ago
Chingliu / xilou_core
A dual-engine PDF/OFD parsing and rendering engine based on PDFium
☆13 · Oct 15, 2024 · Updated last year
fh2019ustc / DocTr-Plus
The official code for “Deep Unrestricted Document Image Rectification”, TMM, 2023.
☆527 · Feb 1, 2026 · Updated 5 months ago
jiangnanboy / Doc-Image-Tool
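Document image processing tool, including bleaching / text orientation correction / sharpness enhancement / note denoising and beautification / shadow removal / distortion correction / edge trimming and enhancement (DocBleach / TextOrientationCorrection / DocSha…
☆134 · Aug 27, 2024 · Updated last year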
WenmuZhou / TableGeneration
Generates table images by rendering them in a browser
☆238 · Apr 10, 2024 · Updated 2 years ago
chenxn2020 / GOSE
[Paper] Code for the EMNLP2023 (Findings) paper "Global Structure Knowledge-Guided Relation Extraction Method for Visually-Rich Document"
☆17 · Dec 1, 2023 · Updated 2 years ago
yongzhuo / layoutlmv3-layoutxlm-chinese
Chinese document classification with LayoutLMv3 and LayoutXLM
☆45 · Oct 25, 2022 · Updated 3 years ago
omarWafaay / MathFormApp
Application for detecting and then recognizing math formulas in images and PDFs
☆13 · Jan 14, 2025 · Updated last year
Sanster / OhMyTable
Table Structure Recognition
☆28 · Jul 25, 2024 · Updated last year
Dawars / DocMAE
Unofficial implementation of DocMAE (WIP): Document Image Rectification via Self-supervised Representation Learning
☆20 · Dec 20, 2023 · Updated 2 years ago
FutureRising007 / Table_Structure_Recognition
Table Structure Recognition
☆83 · Mar 11, 2023 · Updated 3 years ago
poloclub / unitable
UniTable: Towards a Unified Table Foundation Model
☆533 · Apr 21, 2026 · Updated 2 months ago
rebeccaeexu / RRID
[ECCV 2024] Image Demoireing in RAW and sRGB Domains
☆17 · Apr 1, 2026 · Updated 3 months ago
ZeningLin / PEneo
[MM'2024] PEneo, an effective algorithm for key-value pair extraction from form-like documents, designed for real-world applications.
☆41 · Apr 7, 2025 · Updated last year
zhaominyiz / STIRER
STIRER: A Unified Model for Low-Resolution Scene Text Image Recovery and Recognition (ACMMM 2023)
☆14 · Dec 2, 2024 · Updated last year
JG1VPP / MuTabNet
ICDAR 2024/2026 Table OCR Model
☆39 · Jun 16, 2026 · Updated last month
ZZZHANG-jx / GCDRNet
[TAI 2023] Appearance Enhancement for Camera-captured Document Images in the Wild
☆58 · Aug 28, 2025 · Updated 10 months ago
xiaomore / Document-Image-Dewarping
☆69 · Nov 30, 2023 · Updated 2 years ago
BobLd / DocumentLayoutAnalysis
Document layout analysis resources and repositories for development with PdfPig.
☆637 · Oct 1, 2023 · Updated 2 years ago
GuangtaoLyu / FETNet
FETNet: Feature Erasing and Transferring Network for Scene Text Removal
☆35 · Jul 18, 2023 · Updated 3 years ago
Wild-Rift / Document-Layout-Analysis
Tools for extracting figures, tables, text, etc. from a PDF document.
☆35 · Nov 25, 2020 · Updated 5 years ago
chenjun2hao / SRN.pytorch
Unofficial PyTorch implementation of Towards Accurate Scene Text Recognition with Semantic Reasoning Networks
☆191 · May 12, 2020 · Updated 6 years ago
yanqiuxia / BERT-PreTrain
Fine-tunes a BERT model on Chinese-domain data using character-level masking and whole-word masking (wwm) respectively, without the TensorFlow Estimator.
☆24 · Apr 15, 2020 · Updated 6 years ago
Mountchicken / Union14M
[ICCV 2023] Code base for Revisiting Scene Text Recognition: A Data Perspective
☆206 · Nov 1, 2023 · Updated 2 years ago
DS4SD / MolDepictor
[ICCV 23] MolGrapher: Graph-based Visual Recognition of Chemical Structures
☆16 · Oct 27, 2025 · Updated 8 months ago
machine-intelligence-laboratory / DDI-100
Distorted Document Images dataset (DDI-100).
☆146 · Nov 1, 2022 · Updated 3 years ago
DS4SD / MarkushGenerator
[CVPR 25] MarkushGrapher: Joint Visual and Textual Recognition of Markush Structures
☆15 · Mar 22, 2026 · Updated 3 months ago