opendatalab/DocLayout-YOLO

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/opendatalab/DocLayout-YOLO)

opendatalab / DocLayout-YOLO

DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception

☆2,233

Alternatives and similar repositories for DocLayout-YOLO

Users that are interested in DocLayout-YOLO are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

opendatalab / PDF-Extract-Kit
View on GitHub
A Comprehensive Toolkit for High-Quality PDF Content Extraction
☆9,797Jan 3, 2025Updated last year
FreeOCR-AI / layoutreader
View on GitHub
A Faster LayoutReader Model based on LayoutLMv3, Sort OCR bboxes to reading order.
☆322Aug 15, 2025Updated 11 months ago
opendatalab / UniMERNet
View on GitHub
UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition
☆492Sep 28, 2025Updated 9 months ago
opendatalab / OmniDocBench
View on GitHub
[CVPR 2025] A Comprehensive Benchmark for Document Parsing and Evaluation
☆1,900Jun 26, 2026Updated 3 weeks ago
Ucas-HaoranWei / GOT-OCR2.0
View on GitHub
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
☆8,155Feb 10, 2025Updated last year
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
RapidAI / RapidLayout
View on GitHub
Analysis of Chinese and English layouts 中英文版面分析
☆275Mar 24, 2026Updated 3 months ago
AlibabaResearch / AdvancedLiterateMachinery
View on GitHub
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team…
☆1,833Mar 17, 2026Updated 4 months ago
RapidAI / TableStructureRec
View on GitHub
整理目前开源的最优表格识别模型，完善前后处理，模型转换为ONNX | Organize the currently open-source optimal table recognition models, improve pre-processing and post-…
☆954Aug 3, 2025Updated 11 months ago
InternScience / StructEqTable-Deploy
View on GitHub
A High-efficiency Open-source Toolkit for Table-to-Latex Task
☆276Dec 6, 2025Updated 7 months ago
RapidAI / RapidTable
View on GitHub
基于序列表格识别算法推理库，集成PP-Structure和modelscope等表格识别算法。
☆433Apr 23, 2026Updated 2 months ago
studio-dots-ai / dots.ocr
View on GitHub
Multilingual Document Layout Parsing in a Single Vision-Language Model
☆9,016Mar 24, 2026Updated 3 months ago
opendatalab / MinerU
View on GitHub
Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.
☆75,311Updated this week
poloclub / unitable
View on GitHub
UniTable: Towards a Unified Table Foundation Model
☆533Apr 21, 2026Updated 3 months ago
bytedance / Dolphin
View on GitHub
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
☆9,037Mar 25, 2026Updated 3 months ago
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
datalab-to / surya
View on GitHub
OCR, layout analysis, reading order, table recognition in 90+ languages
☆21,130Updated this week
allenai / olmocr
View on GitHub
Toolkit for linearizing PDFs for LLM datasets/training
☆19,151Mar 25, 2026Updated 3 months ago
Yuliang-Liu / MonkeyOCR
View on GitHub
A lightweight LMM-based Document Parsing Model
☆6,605Updated this week
microsoft / CompHRDoc
View on GitHub
Datasets and Evaluation Scripts for CompHRDoc
☆59Feb 25, 2025Updated last year
huridocs / pdf-document-layout-analysis
View on GitHub
A Docker-powered service for PDF document layout analysis. This service provides a powerful and flexible PDF analysis service. The servic…
☆1,251Jul 13, 2026Updated last week
RapidAI / RapidOCR
View on GitHub
📄 Awesome OCR multiple programing languages toolkits based on ONNX Runtime, OpenVINO, MNN, PaddlePaddle, TensorRT and PyTorch.
☆7,225Updated this week
HCIILAB / M6Doc
View on GitHub
☆163May 8, 2025Updated last year
X-PLUG / mPLUG-DocOwl
View on GitHub
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
☆2,408May 30, 2025Updated last year
FreeOCR-AI / yolo-doclaynet
View on GitHub
YOLO models trained by DocLayNet - power your Document Intelligent by Layout Analysis
☆158Mar 10, 2026Updated 4 months ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
NanoNets / docext
View on GitHub
An on-premises, OCR-free unstructured data extraction, markdown conversion and benchmarking toolkit. (https://idp-leaderboard.org/)
☆2,032Mar 17, 2026Updated 4 months ago
microsoft / table-transformer
View on GitHub
Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the o…
☆2,930Jun 24, 2024Updated 2 years ago
Tencent / POINTS-Reader
View on GitHub
☆197Dec 7, 2025Updated 7 months ago
PaddlePaddle / PaddleOCR
View on GitHub
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/…
☆85,960Jul 15, 2026Updated last week
Layout-Parser / layout-parser
View on GitHub
A Unified Toolkit for Deep Learning Based Document Image Analysis
☆5,764Aug 15, 2024Updated last year
360AILAB-NLP / 360LayoutAnalysis
View on GitHub
360LayoutAnaylsis, a series Document Analysis Models and Datasets deleveped by 360 AI Research Institute
☆305Sep 10, 2024Updated last year
clovaai / donut
View on GitHub
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
☆6,906Jul 11, 2024Updated 2 years ago
datalab-to / marker
View on GitHub
Convert PDF to markdown + JSON quickly with high accuracy
☆37,711Updated this week
docling-project / docling
View on GitHub
Get your documents ready for gen AI
☆63,561Updated this week
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
DS4SD / DocLayNet
View on GitHub
DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis
☆446Feb 1, 2023Updated 3 years ago
buptlihang / CDLA
View on GitHub
CDLA: A Chinese document layout analysis (CDLA) dataset
☆293Sep 13, 2021Updated 4 years ago
QwenLM / Qwen3-VL
View on GitHub
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
☆19,637Jan 30, 2026Updated 5 months ago
ZZZHANG-jx / Recommendations-Document-Image-Processing
View on GitHub
This repository contains a paper collection of the methods for document image processing, including appearance enhancement, deshadowing, …
☆394Jun 1, 2026Updated last month
mindee / doctr
View on GitHub
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning. Ongo…
☆6,190Updated this week
facebookresearch / nougat
View on GitHub
Implementation of Nougat Neural Optical Understanding for Academic Documents
☆10,046Feb 21, 2025Updated last year
doc-analysis / ReadingBank
View on GitHub
ReadingBank: A Benchmark Dataset for Reading Order Detection
☆117Aug 26, 2024Updated last year