huridocs / pdf-document-layout-analysis
A Docker-powered service for PDF document layout analysis. It segments and classifies the different parts of PDF pages, identifying elements such as text, titles, pictures, and tables.
☆612 Updated 3 weeks ago
Alternatives and similar repositories for pdf-document-layout-analysis
Users who are interested in pdf-document-layout-analysis are comparing it to the libraries listed below.
- DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception ☆1,367 Updated 2 months ago
- Lightweight, performant, deep table extraction ☆478 Updated this week
- An on-premises, OCR-free unstructured data extraction, markdown conversion and benchmarking toolkit. (https://idp-leaderboard.org/) ☆1,220 Updated last week
- A High-efficiency Open-source Toolkit for Table-to-Latex Task ☆246 Updated 6 months ago
- [CVPR 2025] A Comprehensive Benchmark for Document Parsing and Evaluation ☆526 Updated last month
- An inference library for sequence-based table recognition, integrating table recognition algorithms such as PP-Structure and modelscope. ☆314 Updated this week
- Parse PDFs into markdown using Vision LLMs ☆393 Updated 4 months ago
- python package to parse pdfs with different parsers ☆186 Updated 6 months ago
- A Faster LayoutReader Model based on LayoutLMv3, Sort OCR bboxes to reading order. ☆252 Updated 2 weeks ago
- Detect and extract tables to markdown and csv ☆749 Updated 5 months ago
- DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis ☆349 Updated 2 years ago
- UniTable: Towards a Unified Table Foundation Model ☆482 Updated last year