Tools for extracting figures, tables, text, and more from a PDF document.
☆33 · Nov 25, 2020 · Updated 5 years ago
Alternatives and similar repositories for Document-Layout-Analysis
Users interested in Document-Layout-Analysis are comparing it to the libraries listed below.
- A step-by-step C# implementation of the Docstrum algorithm ☆24 · Dec 13, 2020 · Updated 5 years ago
- AI_DocumentLayoutAnalysis ☆39 · Nov 25, 2020 · Updated 5 years ago
- Document Layout Analysis resources repos for development with PdfPig. ☆633 · Oct 1, 2023 · Updated 2 years ago
- Implementation code for document layout analysis (Hackathon 2020 in Suzhou) ☆81 · Feb 27, 2020 · Updated 6 years ago
- DL models that take a document image file as input, locate the position of paragraphs, lines, images, etc. with their labels and confiden… ☆26 · Dec 31, 2020 · Updated 5 years ago
- Document Layout Analysis Projects ☆23 · Sep 4, 2019 · Updated 6 years ago
- PAGE XML format collection for document image page content and more ☆70 · Jan 16, 2026 · Updated 2 months ago
- ☆18 · May 30, 2023 · Updated 2 years ago
- OCR as a service ☆15 · Dec 11, 2016 · Updated 9 years ago
- This repository contains data used in the NAACL 2021 Paper - Proteno: Text Normalization with Limited Data for Fast Deployment in Text to… ☆45 · May 25, 2021 · Updated 4 years ago
- Mehraban Book Pahlavi typeface by Amir Mahdi Moslehi ☆10 · Mar 2, 2026 · Updated 2 weeks ago
- DocBank: A Benchmark Dataset for Document Layout Analysis ☆639 · Aug 12, 2024 · Updated last year
- Kompakkt - the web based 3D viewer and 3D annotation system. ☆17 · Jan 27, 2026 · Updated last month
- Research project on the state of the field of Multilingual Digital Humanities, with an initial focus on Arabic ☆13 · Mar 1, 2026 · Updated 2 weeks ago
- Rooted in calligraphy, Gotu is a modulated display typeface in Devanagari and Latin, with large loops and voluminous counters. ☆11 · Jan 10, 2020 · Updated 6 years ago
- Table Recognition and Content Extraction in PDF Files ☆23 · Apr 22, 2019 · Updated 6 years ago
- Detectron2 for Document Layout Analysis ☆187 · Aug 2, 2024 · Updated last year
- Layout analysis to find layout elements in documents (similar to P2PaLA) ☆20 · Feb 27, 2026 · Updated 3 weeks ago
- ICDAR 2019: MaskRCNN on PubLayNet datasets. Paragraph detection, table detection, figure detection,... ☆183 · May 11, 2021 · Updated 4 years ago
- Unicode case mapping and character class data for use by TeX ☆19 · Nov 24, 2025 · Updated 3 months ago
- Trained Detectron2 object detection models for document layout analysis based on PubLayNet dataset ☆29 · Apr 16, 2023 · Updated 2 years ago
- Code and data for Teddy https://arxiv.org/abs/2001.05171. ☆15 · Jun 21, 2022 · Updated 3 years ago
- ☆14 · Jul 7, 2021 · Updated 4 years ago
- A set of utilities/wrapper for Test Automation or Performance testing on top of Chrome DevTools Protocol ☆12 · Feb 18, 2024 · Updated 2 years ago
- Using LLM to summarize and enhance media consumption experience. ☆17 · Oct 23, 2023 · Updated 2 years ago
- Code for ICPR2022 paper: "Graph Neural Networks and Representation Embedding for table extraction in PDF Documents" ☆37 · Jul 13, 2023 · Updated 2 years ago
- NLP Web API for Legal Text ☆18 · Dec 23, 2022 · Updated 3 years ago
- Collection of Python scripts to demonstrate asynchronous programming in Python ☆11 · May 22, 2022 · Updated 3 years ago
- CDLA: A Chinese document layout analysis (CDLA) dataset ☆289 · Sep 13, 2021 · Updated 4 years ago
- An interactive grid for sorting, filtering, and editing DataFrames in Jupyter notebooks ☆11 · Mar 15, 2022 · Updated 4 years ago
- ☆71 · Apr 3, 2018 · Updated 7 years ago
- Use scrapy-splash combined with Lua scripts to crawl websites that rely on JavaScript (websosanh) ☆16 · Dec 8, 2022 · Updated 3 years ago
- Convert PubLayNet data into METS/PAGE-XML ☆10 · Mar 17, 2020 · Updated 6 years ago
- A simplified port of LayoutParser for detecting layout elements on documents. ☆13 · Jun 3, 2024 · Updated last year
- PDF Extraction Toolkit (wraps and trains LayoutLM) ☆10 · Oct 8, 2021 · Updated 4 years ago
- 4th Place solution for the Kaggle CommonLit Readability Prize ☆38 · Aug 12, 2021 · Updated 4 years ago
- ☆11 · Feb 11, 2025 · Updated last year
- Modal LLM LLama.cpp based model deployment as part of series of Model as a Service (MaaS) ☆17 · Feb 27, 2026 · Updated 3 weeks ago
- A frozen version of angr for the SAILR paper ☆16 · Sep 4, 2024 · Updated last year