MBAigner/PDFSegmenter

This library builds a graph representation of the content of PDFs. The graph is then clustered, and the resulting page segments are classified and returned. Tables are returned formatted as CSV.

☆23

Alternatives and similar repositories for PDFSegmenter

Users who are interested in PDFSegmenter are comparing it to the libraries listed below.

swapnil-ahlawat / Document_Layout_Analysis-MonkAI
DL models that take a document image file as input, locate the position of paragraphs, lines, images, etc. with their labels and confiden…
☆26 · Dec 31, 2020 · Updated 5 years ago
tamirhassan / pdfxtk
PDF Extraction Toolkit
☆43 · Nov 23, 2020 · Updated 5 years ago
woldemarg / borderless_tbls_detection
☆12 · Dec 22, 2020 · Updated 5 years ago
Sreyan88 / DALE
Code for EMNLP 2023 paper: DALE: Generative Data Augmentation for Low-Resource Legal NLP
☆11 · Oct 27, 2023 · Updated 2 years ago
darrow-labs / LegalLens
☆10 · Jul 15, 2024 · Updated 2 years ago
bhattbhavesh91 / table-detection-streamlit-application
Google Colab Demo of CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents
☆47 · Oct 12, 2021 · Updated 4 years ago
phamquiluan / PubLayNet
ICDAR 2019: MaskRCNN on PubLayNet datasets. Paragraph detection, table detection, figure detection,...
☆183 · May 11, 2021 · Updated 5 years ago
FinanceAndPython / FinanceAndPython.com-BasicFinance
☆13 · Jun 21, 2017 · Updated 9 years ago
maastrichtlawtech / extraction_libraries
Python libraries for extracting from data sources like Rechtspraak, ECHR, Cellar
☆13 · Jul 2, 2025 · Updated last year
VectorInstitute / DANER
Data Annotation Tool for Named Entity Recognition using Active Learning and Transfer Learning
☆11 · Aug 20, 2021 · Updated 4 years ago
msra-nlc / Table2Text
☆10 · Apr 16, 2019 · Updated 7 years ago
MarcusElwin / ner-dspy
Using DSPy for NER tasks using LLMs
☆17 · Apr 1, 2024 · Updated 2 years ago
NilsBarlaug / lemon
LEMON: Explainable Entity Matching
☆19 · Apr 6, 2022 · Updated 4 years ago
hccngu / Meta-SN
☆11 · May 23, 2023 · Updated 3 years ago
carina-studio / AutoUpdater
Auto updater for portable application.
☆13 · Apr 24, 2026 · Updated 2 months ago
BlackBoiler / legal-nlp-papers
A repository of legal NLP research papers.
☆13 · Jan 3, 2020 · Updated 6 years ago
danistrebel / SemanticGraphQL
GraphQL for the semantic web
☆12 · Jan 21, 2016 · Updated 10 years ago
maastrichtlawtech / law3027-advanced-legal-analytics
📚 Materials for Advanced Legal Analytics (LAW3027) @ Maastricht University.
☆14 · May 8, 2024 · Updated 2 years ago
bungenix / akomantoso-lib
Java API for the AkomaNtoso XML Schema
☆15 · Jun 29, 2017 · Updated 9 years ago
Egolds / Avalonia.Xaml.Interactions.Animated
Smooth animation support for vertical scrolling in the ScrollViewer.
☆12 · Jul 11, 2025 · Updated last year
bertsky / ocrd_publaynet
convert PubLayNet data into METS/PAGE-XML
☆10 · Mar 17, 2020 · Updated 6 years ago
robert-lieck / RBN
Recursive Bayesian Networks
☆11 · May 11, 2025 · Updated last year
Pay20Y / Layout_Analysis
Document Layout Analysis Projects
☆23 · Sep 4, 2019 · Updated 6 years ago
simonv3 / annotate
Image Annotation App for Sandstorm
☆14 · Nov 8, 2017 · Updated 8 years ago
jsuarezruiz / AvaloniaSkiaSharpFiddle
Avalonia SkiaSharp Fiddle is a SkiaSharp playground created with Avalonia and running on macOS, Linux, Windows and WebAssembly.
☆13 · Mar 7, 2022 · Updated 4 years ago
bockph / Legal-Sentence-Role-Classification
This repo is about the classification of rhetorical roles in Legal Documents such as: Citation, Findings of Fact, Evidence, Legal Rule, R…
☆18 · Feb 22, 2022 · Updated 4 years ago
kekekeks / example-avalonia-huge-tree
☆13 · Oct 16, 2020 · Updated 5 years ago
nikolamilosevic86 / TabInOut
Framework for information extraction from tables
☆41 · Apr 15, 2019 · Updated 7 years ago
firmao / shaclEditor
A GUI for edit RDF with SHACL constraints
☆14 · Sep 26, 2023 · Updated 2 years ago
DawnEver / mcm-icm-typst-template
☆11 · Jan 29, 2026 · Updated 5 months ago
maastrichtlawtech / case-law-explorer
☁️ A network analysis software platform for analyzing Dutch and European court decisions.
☆23 · Mar 31, 2026 · Updated 3 months ago
danielklecha / SharpIppNext
A .NET Standard library for building Internet Printing Protocol (IPP) clients and servers.
☆57 · Updated this week
bazmonk / digigurdy-baz
DigiGurdy Teensy Code
☆22 · Feb 21, 2024 · Updated 2 years ago
DM2-ND / SciKG
Author: Tianwen Jiang (tjiang2@nd.edu). KDD'19. Knowledge graph construction.
☆13 · Sep 27, 2019 · Updated 6 years ago
vemonet / nanopub-rs
✍️ A cross-platform Rust library to sign, publish, and check Nanopublications, with bindings to Python and JS (wasm)
☆23 · Mar 2, 2026 · Updated 4 months ago
johnb30 / cliff-docker
A Docker image for the CLIFF geolocation software.
☆10 · Jun 12, 2018 · Updated 8 years ago
Prakhar-97 / Table-detection-and-Document-layout-analysis
☆10 · Jun 22, 2020 · Updated 6 years ago
holoto / ICC-Profiles
ICC Profiles
☆12 · Aug 30, 2018 · Updated 7 years ago
BobLd / DocumentLayoutAnalysis
Document Layout Analysis resources repos for development with PdfPig.
☆637 · Oct 1, 2023 · Updated 2 years ago