tianchiguaixia/layoutlmv3-chinese

tianchiguaixia / layoutlmv3-chinese

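This project uses LayoutLMv3 for training and inference on Chinese document images. It mainly addresses three problems: 1. Normalizing data into a usable training-dataset format 2. Modifying the tokenization for layoutlmv3-base-chinese 3. Splitting text longer than 512 tokens and applying a sliding window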

☆64
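
The sliding-window step named in the description can be illustrated with a minimal sketch. This is not the repository's code: the function name `sliding_windows`, the parameters `max_len` and `stride`, and the default values are all assumptions made for illustration. It assumes the input is already tokenized, with one bounding box per token, and that `max_len` is the number of content tokens allowed per window (any special tokens the model adds would have to be budgeted separately).

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), hypothetical layout


def sliding_windows(
    tokens: Sequence[int],
    boxes: Sequence[Box],
    max_len: int = 512,   # assumed per-window token budget
    stride: int = 128,    # assumed overlap between consecutive windows
) -> List[Tuple[int, List[int], List[Box]]]:
    """Split a long token sequence into overlapping windows.

    Returns (start_offset, window_tokens, window_boxes) triples so that
    predictions can later be mapped back to the original positions.
    """
    if len(tokens) != len(boxes):
        raise ValueError("tokens and boxes must have the same length")
    if not 0 <= stride < max_len:
        raise ValueError("stride must satisfy 0 <= stride < max_len")
    if not tokens:
        return []

    step = max_len - stride  # how far each new window advances
    windows = []
    start = 0
    while True:
        end = min(start + max_len, len(tokens))
        windows.append((start, list(tokens[start:end]), list(boxes[start:end])))
        if end == len(tokens):
            break  # the last window reaches the end of the sequence
        start += step
    return windows
```

Tokens in the overlap region appear in two windows, so a real pipeline would also need a rule for merging the duplicate predictions (for example, keeping the one from the window where the token is farther from the edge).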

Alternatives and similar repositories for layoutlmv3-chinese

Users who are interested in layoutlmv3-chinese are comparing it to the libraries listed below.

yongzhuo / layoutlmv3-layoutxlm-chinese
Chinese document classification with LayoutLMv3 and LayoutXLM
☆45 · Oct 25, 2022 · Updated 3 years ago
ZeningLin / PEneo
[MM'2024] PEneo, an effective algorithm for key-value pair extraction from form-like documents, designed for real-world applications.
☆41 · Apr 7, 2025 · Updated last year
manikanthp / LayoutLMV3_Fine_Tuning
☆69 · Sep 24, 2023 · Updated 2 years ago
seanzhang-zhichen / Qwen-WisdomVast
Qwen-WisdomVast is a large model trained on 1 million high-quality Chinese multi-turn SFT data, 200,000 English multi-turn SFT data, and …
☆17 · Apr 12, 2024 · Updated 2 years ago
yujunhuics / LayoutReader
Reading order, LayoutReader
☆18 · May 8, 2025 · Updated last year
Prakhar-97 / Table-detection-and-Document-layout-analysis
☆10 · Jun 22, 2020 · Updated 6 years ago
TenMilesLotus / DTSM
Code and data for the paper: DTSM: Toward Dense Table Structure Recognition with Text Query Encoder and Adjacent Feature Aggregator
☆13 · Apr 28, 2024 · Updated 2 years ago
360AILAB-NLP / 360LayoutAnalysis
360LayoutAnalysis, a series of document analysis models and datasets developed by 360 AI Research Institute
☆305 · Sep 10, 2024 · Updated last year
SWHL / LGPMA_Infer
LGPMA inference for table structure recognition
☆25 · Nov 17, 2022 · Updated 3 years ago
johnning2333 / M2Doc
☆43 · Jun 15, 2024 · Updated 2 years ago
OKC13 / General-Documents-Layout-parser
General layout analysis | Chinese document parsing | Document Layout Analysis | layout parser
☆47 · Jun 13, 2024 · Updated 2 years ago
buptlihang / CDLA
CDLA: A Chinese document layout analysis (CDLA) dataset
☆294 · Sep 13, 2021 · Updated 4 years ago
ZZR8066 / SEM
☆19 · Mar 10, 2023 · Updated 3 years ago
JG1VPP / MuTabNet
ICDAR 2024/2026 Table OCR Model
☆39 · Jun 16, 2026 · Updated last month
Sanster / OhMyTable
Table Structure Recognition
☆28 · Jul 25, 2024 · Updated 2 years ago
MaitySubhajit / SelfDocSeg
[ICDAR 2023] SelfDocSeg: A self-supervised vision-based approach towards Document Segmentation (Oral)
☆43 · Oct 6, 2023 · Updated 2 years ago
naver-ai / trace
TRACE: Table Reconstruction Aligned to Corner and Edges (ICDAR 2023)
☆32 · Mar 13, 2024 · Updated 2 years ago
MAEHCM / AET
Code for AAAI 2023 Paper: “Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models”
☆18 · Dec 6, 2022 · Updated 3 years ago
tianchiguaixia / qwen1.5-ner
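Fine-tunes the Qwen1.5-0.5B-Chat model on general information extraction tasks, with three goals: evaluating a generative approach against extractive NER; giving beginners a simple fine-tuning workflow with as little code as possible; and handling data formatting for large-model training.
☆14 · Sep 6, 2024 · Updated last year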
GreatV / DocTrPP
DocTr++ in PaddlePaddle
☆57 · Jul 24, 2024 · Updated 2 years ago
wzx99 / CLIPOCR
☆38 · Oct 20, 2023 · Updated 2 years ago
SWHL / TrOCR-Formula-Rec
Trains a small but capable formula recognition model based on TrOCR and the UniMER-1M dataset
☆30 · Mar 17, 2026 · Updated 4 months ago
poloclub / tsr-convstem
High-Performance Transformers for Table Structure Recognition Need Early Convolutions
☆45 · Apr 21, 2026 · Updated 3 months ago
CaseDrive / publaynet-models
Trained Detectron2 object detection models for document layout analysis based on PubLayNet dataset
☆28 · Apr 16, 2023 · Updated 3 years ago
LayTextLLM / LayTextLLM
☆103 · Dec 23, 2024 · Updated last year
zdyshine / Baidu-netdisk-AI-Image-processing-Challenge-demoire
Baidu Netdisk AI Competition, Image Processing Challenge: 2nd-place solution for document image moiré removal
☆43 · Nov 28, 2023 · Updated 2 years ago
1694439208 / GOT-OCR-Inference
Research on accelerating GOT-OCR for production deployment, not limited to any language
☆62 · Oct 24, 2024 · Updated last year
HCIILAB / M5HisDoc
☆34 · Dec 18, 2025 · Updated 7 months ago
morning-hao / domain-self-instruct
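Inspired by self-instruct: beyond general-purpose LLMs, small vertical-domain LLMs can be built for customized results, using questions and answers obtained from GPT as training data
☆18 · May 12, 2023 · Updated 3 years ago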
microsoft / CompHRDoc
Datasets and Evaluation Scripts for CompHRDoc
☆59 · Feb 25, 2025 · Updated last year
InternScience / StructEqTable-Deploy
A High-efficiency Open-source Toolkit for Table-to-Latex Task
☆276 · Dec 6, 2025 · Updated 7 months ago
wux-labs / OpenXLab-IntelligentSalesAssistant
☆19 · Jun 21, 2024 · Updated 2 years ago
xiangly55 / ReMoNet
Code for "ReMoNet: Recurrent Multi-output Network for Efficient Video Denoising" (AAAI 2022)
☆12 · Mar 19, 2025 · Updated last year
Jarviswx / tonghuashun_text_matching
Tonghuashun (同花顺) algorithm challenge platform: [September-October bimonthly contest] text semantic matching with cross-domain transfer
☆11 · Oct 28, 2021 · Updated 4 years ago
RapidAI / RapidLayout
Analysis of Chinese and English layouts
☆275 · Mar 24, 2026 · Updated 4 months ago
SCUT-DLVCLab / GPT-4V_OCR
Evaluation of the Optical Character Recognition (OCR) capabilities of GPT-4V(ision)
☆128 · Nov 13, 2023 · Updated 2 years ago
chongzhangFDU / ROOR
This is the official implementation to the EMNLP 2024 paper: Modeling Layout Reading Order as Ordering Relations for Visually-rich Docume…
☆32 · Jan 19, 2026 · Updated 6 months ago
Tan-Junwen / awesome-table-structure-recognition
A Curated List of Awesome Table Structure Recognition (TSR) Research. Including models, papers, datasets and codes. Continuously updating…
☆232 · Sep 9, 2024 · Updated last year
FreeOCR-AI / layoutreader
A faster LayoutReader model based on LayoutLMv3 that sorts OCR bboxes into reading order.
☆323 · Aug 15, 2025 · Updated 11 months ago