General layout analysis | Chinese document parsing | Document Layout Analysis | layout parser
☆49Jun 13, 2024Updated last year
Alternatives and similar repositories for General-Documents-Layout-parser
Users that are interested in General-Documents-Layout-parser are comparing it to the libraries listed below
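- Fine-tunes Alibaba's open-source text detection model, using OCR results returned by the Hehe (合合) recognition service as initial training data, so that the model better fits a specific scenario of 10,000 images and text recognition accuracy improves.☆10Dec 9, 2024Updated last year
- Fine-tunes the Qwen1.5-0.5B-Chat model for general information extraction, aiming to: verify how the generative approach compares with extractive NER; give beginners a simple fine-tuning workflow with as little code as possible; handle data formatting for large-model training.☆15Sep 6, 2024Updated last year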
- LGPMA inference for table structure recognition☆25Nov 17, 2022Updated 3 years ago
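- This project trains and runs inference with layoutlmv3 on Chinese images. It mainly addresses three problems: 1. standardizing data into a usable training dataset format 2. modifying the tokenization of layoutlmv3-base-chinese 3. splitting text longer than 512 and applying a sliding window☆63Sep 6, 2024Updated last year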
- From Llama to Deepseek, grpo/mtp implemented. With pt/sft/lora/qlora included☆30Apr 21, 2025Updated 10 months ago
- ☆47Jul 19, 2022Updated 3 years ago
- Chinese document classification with layoutlmv3 and layoutxlm☆46Oct 25, 2022Updated 3 years ago
- Uses Swin-Unet (Swin Transformer Unet) to recognize table structure in document images☆28Feb 23, 2024Updated 2 years ago
- ☆157May 8, 2025Updated 10 months ago
- Table Structure Recognition☆28Jul 25, 2024Updated last year
- TianGong-AI-Unstructure☆71Feb 4, 2026Updated last month
- A knowledge base backend system for LLMs with full-text search, semantic retrieval, and knowledge graph querying. Ready-to-use modules fo…☆28Apr 13, 2025Updated 11 months ago
- ☆67Sep 18, 2024Updated last year
- ☆18Feb 5, 2026Updated last month
- 【ArXiv】PDF-Wukong: A Large Multimodal Model for Efficient Long PDF Reading with End-to-End Sparse Sampling☆128Jun 4, 2025Updated 9 months ago
- benchmark of KgCLUE, with different models and methods☆28Dec 13, 2021Updated 4 years ago
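- [Gap-Tree-Sort algorithm] Performs layout analysis on OCR results or text extracted from PDFs and sorts it in human reading order.☆162Feb 28, 2024Updated 2 years ago
- Document orientation classification☆222Feb 3, 2026Updated last month
- Local feature crack for Termius Pro☆10May 11, 2024Updated last year
- SMP 2023 ChatGLM Financial LLM Challenge: introduction to a 60-point baseline approach☆186Aug 10, 2023Updated 2 years ago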
- ☆27Jun 23, 2020Updated 5 years ago
- CDLA: A Chinese document layout analysis (CDLA) dataset☆289Sep 13, 2021Updated 4 years ago
- An easily extensible framework for understanding and optimizing CUDA operators, intended for learning purposes only☆18Jun 13, 2024Updated last year
- ☆14Feb 25, 2025Updated last year
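- Menghu (猛虎) cloud-based vehicle fault diagnosis system☆13Dec 12, 2014Updated 11 years ago
- 2023 Global Intelligent Vehicle AI Challenge, Track 1: retrieval question answering with large AI models, 75+ baseline☆61Dec 7, 2023Updated 2 years ago
- Extracts error-correction corpora from Wikipedia edit history data.☆12Apr 6, 2022Updated 3 years ago
- Corrects document distortion, blur, shadows, and similar defects, using ONNX models for simple, lightweight deployment; will keep tracking the latest and best document rectification methods and models. We wi…☆98Dec 17, 2025Updated 3 months ago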
- A Faster LayoutReader Model based on LayoutLMv3, Sort OCR bboxes to reading order.☆315Aug 15, 2025Updated 7 months ago
- A simple, fast tool for word segmentation and named entity recognition☆631Sep 26, 2025Updated 5 months ago
- Research project for task-oriented dialogue system with jointly training multi-intent classification and slot filling☆10Sep 11, 2023Updated 2 years ago
- Text error correction based on pycorrector and chatglm3-6b☆12Mar 10, 2024Updated 2 years ago
- ☆24Oct 8, 2021Updated 4 years ago
- ☆11Nov 11, 2022Updated 3 years ago
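- Detection and rectification of ID cards and documents☆82Sep 18, 2024Updated last year
- OCR pre-processing algorithm implemented in C for removing color seals☆17Mar 4, 2019Updated 7 years ago
- C++ implementation of yolov5 on rknn, bundled with all dependency libraries; a complete project that can be compiled and run directly☆20Feb 10, 2022Updated 4 years ago
- Chinese text error correction☆91Mar 7, 2022Updated 4 years ago
- Structured parsing of electronic medical records☆13May 11, 2022Updated 3 years ago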