A step-by-step C# implementation of the Docstrum algorithm
☆24 · Dec 13, 2020 · Updated 5 years ago
Alternatives and similar repositories for simple-docstrum
Users that are interested in simple-docstrum are comparing it to the libraries listed below.
- Tools to extract figures, tables, text, etc. from a PDF document. ☆35 · Nov 25, 2020 · Updated 5 years ago
- ☆71 · Apr 3, 2018 · Updated 8 years ago
- Document Layout Analysis resources repos for development with PdfPig. ☆635 · Oct 1, 2023 · Updated 2 years ago
- Document Layout Analysis Projects ☆23 · Sep 4, 2019 · Updated 6 years ago
- PAGE XML format collection for document image page content and more ☆72 · Jan 16, 2026 · Updated 4 months ago
- Simple docker deployment of document layout analysis using detectron2 ☆19 · Nov 7, 2021 · Updated 4 years ago
- BoundaryNet - A Semi-Automatic Layout Annotation Tool ☆24 · Dec 11, 2021 · Updated 4 years ago
- Implementation of BertGrid: https://arxiv.org/abs/1909.04948 ☆30 · Apr 10, 2024 · Updated 2 years ago
- Transcriptions of primers (19th century) ☆11 · Oct 31, 2025 · Updated 7 months ago
- This library builds a graph-representation of the content of PDFs. The graph is then clustered, resulting page segments are classified an… ☆23 · Sep 11, 2020 · Updated 5 years ago
- A C# library to extract tabular data from PDFs (port of camelot Python version using PdfPig). ☆35 · Feb 4, 2022 · Updated 4 years ago
- Extract tables from PDF files (port of tabula-java) ☆211 · May 4, 2026 · Updated last month
- NLP system for identifying patient housing status in Veteran Affairs ☆11 · Feb 18, 2024 · Updated 2 years ago
- RUN LENGTH SMOOTHING ALGORITHM (RLSA) is a method mainly used for block segmentation and text discrimination. It helps to extract the nece… ☆24 · Jun 21, 2022 · Updated 3 years ago
- OCR-D post-correction with encoder-attention-decoder LSTMs ☆13 · May 1, 2025 · Updated last year
- ICDAR 2021 Competition on Scientific Literature Parsing ☆35 · Aug 20, 2020 · Updated 5 years ago
- SLUB Document Classification and Similarity Analysis ☆10 · Aug 31, 2023 · Updated 2 years ago
- GloSAT Historical Measurement Table Dataset ☆11 · Dec 3, 2025 · Updated 6 months ago
- Specification of the @OCR-D technical architecture, interface definitions and data exchange format(s) ☆17 · Sep 18, 2025 · Updated 8 months ago
- convert qqwweee/keras-yolo3 h5 file to tensorflow pb file ☆10 · Jul 17, 2020 · Updated 5 years ago
- Grobid module for superconductor material and properties extraction ☆23 · May 17, 2025 · Updated last year
- OCR-D post-correction module based on weighted finite-state transducers ☆11 · Jan 13, 2024 · Updated 2 years ago
- Deep learning based page layout analysis ☆197 · Apr 24, 2019 · Updated 7 years ago
- OCR-D compliant toolset for optical layout recognition on historical German-language documents published in Brazil ☆11 · Sep 24, 2021 · Updated 4 years ago
- Rust bindings for the Ghostscript PS/PDF interpreter library ☆10 · Jan 14, 2018 · Updated 8 years ago
- DEPRECATED: Use https://github.com/18F/gapps-download instead ☆10 · Oct 27, 2015 · Updated 10 years ago
- Etalab's Lab IA Pseudonymization Demo source code ☆11 · Aug 3, 2023 · Updated 2 years ago
- A workflow system for Natural Language Processing. ☆21 · Oct 17, 2019 · Updated 6 years ago
- Digital Contracting Cookbook ☆10 · Mar 9, 2016 · Updated 10 years ago
- A simple document layout analysis using Python-OpenCV ☆127 · Aug 11, 2020 · Updated 5 years ago
- Repository to use/train segmentation models for document layout analysis ☆19 · Jan 13, 2022 · Updated 4 years ago
- Rust library for extracting data from HTML tables. ☆13 · Mar 4, 2024 · Updated 2 years ago
- Progressively enhance your HTML with dynamic data ☆13 · May 1, 2018 · Updated 8 years ago
- A Trello webhook server ☆10 · May 18, 2016 · Updated 10 years ago
- Dokku buildpack for GitLab ☆22 · Apr 5, 2015 · Updated 11 years ago
- RUN LENGTH SMOOTHING ALGORITHM (RLSA) is a method mainly used for block segmentation and text discrimination. It helps to extract the nece… ☆29 · Nov 5, 2023 · Updated 2 years ago
- A complete agency API program. ☆12 · Apr 27, 2017 · Updated 9 years ago
- Recognize text using Calamari OCR and the OCR-D framework ☆16 · May 13, 2025 · Updated last year
- An interactive grid for sorting, filtering, and editing DataFrames in Jupyter notebooks ☆11 · Mar 15, 2022 · Updated 4 years ago