Veason-silverbullet/ViTLP

[NAACL 2024] Visually Guided Generative Text-Layout Pre-training for Document Intelligence

☆149

Alternatives and similar repositories for ViTLP

Users who are interested in ViTLP are comparing it to the repositories listed below.

johnning2333 / M2Doc
☆43 · Jun 15, 2024 · Updated 2 years ago

JierunChen / SFT-RL-SynergyDilemma
☆15 · Jan 14, 2026 · Updated 6 months ago

LayTextLLM / LayTextLLM
☆103 · Dec 23, 2024 · Updated last year

LukeForeverYoung / UReader
☆142 · Feb 13, 2024 · Updated 2 years ago

AlibabaResearch / AdvancedLiterateMachinery
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team…
☆1,833 · Mar 17, 2026 · Updated 4 months ago

nttmdlab-nlp / InstructDoc
InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions (AAAI2024)
☆162 · May 31, 2024 · Updated 2 years ago

WenmuZhou / TableGeneration
Generate table images by rendering them in a browser
☆238 · Apr 10, 2024 · Updated 2 years ago

AILab-UniFI / cte-dataset
CTE: Contextualized Table Extraction Dataset
☆17 · Feb 23, 2023 · Updated 3 years ago

X-PLUG / mPLUG-DocOwl
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
☆2,410 · May 30, 2025 · Updated last year

xhli-git / DocSAM
☆33 · Apr 8, 2025 · Updated last year

irisXcoding / DocReal
DocReal: Robust Document Dewarping of Real-Life Images via Attention-Enhanced Control Point Prediction
☆30 · Jun 28, 2023 · Updated 3 years ago

SpursGoZmy / Table-LLaVA
Dataset and Code for our ACL 2024 paper: "Multimodal Table Understanding". We propose the first large-scale Multimodal IFT and Pre-Train …
☆227 · Jun 12, 2025 · Updated last year

MaxKinny / TabRecSet
A large-scale camera-taken table detection and recognition dataset.
☆151 · Apr 9, 2026 · Updated 3 months ago

LB-Young / Bambo
Bambo is a new proxy framework. Compared with mainstream frameworks, it is more lightweight and flexible and can handle various load task…
☆34 · Feb 10, 2025 · Updated last year

Chunchunwumu / SEMv3
The official PyTorch implementation of SEMv3.
☆53 · May 26, 2024 · Updated 2 years ago

DocTron-hub / Chart-R1
Chart-R1: Chain-of-Thought Supervision and Reinforcement for Advanced Chart Reasoner
☆24 · Aug 7, 2025 · Updated 11 months ago

360AILAB-NLP / 360LayoutAnalysis
360LayoutAnalysis, a series of Document Analysis Models and Datasets developed by 360 AI Research Institute
☆305 · Sep 10, 2024 · Updated last year

opendatalab / DocLayout-YOLO
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
☆2,235 · Apr 14, 2025 · Updated last year

RapidAI / TableStructureRec
Organize the currently open-source optimal table recognition models, improve pre-processing and post-processing, and convert the models to ONNX
☆955 · Aug 3, 2025 · Updated 11 months ago

jpWang / LiLT
Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understan…
☆366 · Oct 31, 2022 · Updated 3 years ago

FreeOCR-AI / layoutreader
A faster LayoutReader model based on LayoutLMv3 that sorts OCR bboxes into reading order.
☆322 · Aug 15, 2025 · Updated 11 months ago

harrytea / Awesome-Document-Understanding
Document Artificial Intelligence
☆201 · Sep 28, 2025 · Updated 9 months ago

yujunhuics / LayoutReader
Reading order, Layoutreader
☆18 · May 8, 2025 · Updated last year

buptlihang / CDLA
CDLA: A Chinese document layout analysis (CDLA) dataset
☆294 · Sep 13, 2021 · Updated 4 years ago

Line-Kite / GraphLayoutLM
☆14 · Sep 6, 2024 · Updated last year

namtuanly / MTL-TabNet
MTL-TabNet: Multi-task Learning based Model for Image-based Table Recognition
☆103 · May 30, 2024 · Updated 2 years ago

HCIILAB / M6Doc
☆164 · May 8, 2025 · Updated last year

chongzhangFDU / ROOR
This is the official implementation of the EMNLP 2024 paper: Modeling Layout Reading Order as Ordering Relations for Visually-rich Docume…
☆32 · Jan 19, 2026 · Updated 6 months ago

Freeky7819 / DragonMemory
Neural Memory Compression System for RAG Applications
☆20 · Nov 20, 2025 · Updated 8 months ago

ZZR8066 / SEMv2
☆71 · Jun 26, 2024 · Updated 2 years ago

rossumai / docile
DocILE: Document Information Localization and Extraction Benchmark
☆149 · Jun 17, 2026 · Updated last month

Ucas-HaoranWei / GOT-OCR2.0
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
☆8,158 · Feb 10, 2025 · Updated last year

ispamm / NAF-DPM
NAF-DPM: A Nonlinear Activation-Free Diffusion Probabilistic Model for Document Enhancement
☆54 · Aug 5, 2024 · Updated last year

IBM / KVP10k
Repository for the KVP10k dataset
☆23 · Sep 18, 2025 · Updated 10 months ago

IITB-LEAP-OCR / SPRINT
SPRINT: Script-agnostic Structure Recognition in Tables
☆17 · Mar 26, 2025 · Updated last year

Tencent / POINTS-Reader
☆197 · Dec 7, 2025 · Updated 7 months ago

tianchiguaixia / layoutlmv3-chinese
This project uses layoutlmv3 for training and inference on Chinese images. It mainly addresses three problems: 1. standardizing data into a usable training dataset format; 2. modifying the tokenization of layoutlmv3-base-chinese; 3. splitting text that exceeds a length of 512 and applying a sliding window.
☆64 · Sep 6, 2024 · Updated last year

moured / YOLOv10-Document-Layout-Analysis
YOLOv10 trained on DocLayNet dataset.
☆82 · Nov 1, 2024 · Updated last year

clovaai / webvicob
Official Implementation of Web-based Visual Corpus Builder (Webvicob), ICDAR 2023
☆110 · Oct 24, 2023 · Updated 2 years ago