dpapathanasiou/pdfminer-layout-scanner

A more complete example of programming with PDFMiner, which continues where the default documentation stops

☆216

Alternatives and similar repositories for pdfminer-layout-scanner

Users who are interested in pdfminer-layout-scanner are comparing it to the repositories listed below.

euske / pdfminer
Python PDF Parser (Not actively maintained). Check out pdfminer.six.
☆5,281 · Dec 7, 2022 · Updated 3 years ago
syllabs / pdf2text
A PDFMiner wrapper to ease the text extraction from pdf files.
☆24 · Apr 25, 2013 · Updated 13 years ago
pdfminer / pdfminer.six
Community maintained fork of pdfminer - we fathom PDF
☆7,005 · Mar 13, 2026 · Updated 4 months ago
ppasupat / web-entity-extractor-ACL2014
☆13 · Jun 14, 2016 · Updated 10 years ago
ijmbarr / parsing-pdfs
Extracting tabular information from PDFs using python
☆43 · Apr 4, 2019 · Updated 7 years ago
twofactor / MenubarNotes
Small Notes App for OSX Menubar
☆13 · Oct 24, 2016 · Updated 9 years ago
jcushman / pdfquery
A fast and friendly PDF scraping library.
☆781 · Oct 17, 2023 · Updated 2 years ago
maplight / CAPS
CAL-ACCESS Campaign Power Search
☆12 · Nov 2, 2017 · Updated 8 years ago
openelections / openelections-sources-tx
Unprocessed results files from Texas
☆10 · Sep 8, 2025 · Updated 10 months ago
felipeochoa / minecart
Simple, Pythonic extraction of text, shapes and images from PDFs
☆80 · Jun 4, 2020 · Updated 6 years ago
elacin / PDFExtract
my take at a PDF text extraction utility
☆15 · Jun 15, 2015 · Updated 11 years ago
strin / cbt-model
algorithms for solving the Children's Book Test (CBT)
☆10 · Jun 8, 2016 · Updated 10 years ago
kevinweber / inline-comments
Inline Comments adds your comment system to the side of paragraphs and other sections of your post. WordPress plugin.
☆31 · Apr 14, 2018 · Updated 8 years ago
WZBSocialScienceCenter / pdftabextract
A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.
☆2,255 · Jun 24, 2022 · Updated 4 years ago
alephdata / pdflib
Binary Python bindings for poppler utils for content extraction
☆42 · May 12, 2021 · Updated 5 years ago
18F / linkify-citations
Turns legal citations in the DOM into links
☆20 · Mar 15, 2017 · Updated 9 years ago
altoxml / documentation
Documentation and use cases for ALTO XML
☆42 · Sep 10, 2018 · Updated 7 years ago
BlackHolePerturbationToolkit / GeneralRelativityTensors
Provides a set of functions for performing coordinate-based tensor calculations with a focus on general relativity and black holes in par…
☆11 · Jan 20, 2021 · Updated 5 years ago
duetosymmetry / simple-slow-rot-NS-solver
Simple slow-rotation neutron star structure solver
☆12 · Nov 12, 2020 · Updated 5 years ago
sohamkamani / d3-force-gravity
Implement gravitational attraction (or force-field-like repulsion) using d3-force
☆22 · Aug 12, 2016 · Updated 9 years ago
Robpol86 / etaprogress
Easy to use ETA calculation and progress bar library.
☆14 · Jun 2, 2015 · Updated 11 years ago
harshilpatel312 / open-images-downloader
Download specific objects from Open-Images Dataset
☆37 · Aug 3, 2018 · Updated 7 years ago
jonschlinkert / parse-csv
CSV parser for node.js
☆15 · Mar 11, 2019 · Updated 7 years ago
Juris-M / jurism-abbreviations
Abbreviations for use with the Abbreviation Filter developed for use with Multilingual Zotero.
☆18 · Nov 8, 2023 · Updated 2 years ago
timClicks / slate
The simplest way to extract text from PDFs in Python
☆427 · Jul 7, 2022 · Updated 4 years ago
turicas / templater
Extract, parse and populate templates from strings
☆28 · Apr 4, 2019 · Updated 7 years ago
py-pdf / pypdf
A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
☆10,127 · Updated this week
ashima / pdf-table-extract
Extract tables from PDF pages.
☆300 · Jun 25, 2020 · Updated 6 years ago
hrbrmstr / xslt
lightweight XSLT processing package for R based on xmlwrapp
☆22 · Mar 7, 2017 · Updated 9 years ago
pmaupin / pdfrw
pdfrw is a pure Python library that reads and writes PDFs
☆1,909 · Apr 29, 2024 · Updated 2 years ago
alexpreynolds / soda
Python-based UCSC genome browser snapshot-taker and gallery-maker
☆12 · May 21, 2024 · Updated 2 years ago
gamallo / DepPattern
Dependency Syntactic Parsing for Portuguese, Spanish, English, and Galician, including MetaRomance parser
☆10 · Jun 7, 2018 · Updated 8 years ago
HazyResearch / fonduer
A knowledge base construction engine for richly formatted data
☆412 · Jun 23, 2021 · Updated 5 years ago
nstringham / othello-web-app
☆13 · Jul 3, 2026 · Updated 2 weeks ago
danielecook / tut
A collection of CSV/TSV Utilities
☆13 · Jun 2, 2020 · Updated 6 years ago
strin / mocha-gemm-profile
profiling gemm on android
☆10 · Apr 1, 2016 · Updated 10 years ago
newsdev / nyt-pyfec
A Python library for downloading, parsing and cleaning Federal Election Commission filings.
☆28 · Jan 30, 2024 · Updated 2 years ago
JoshData / pdf-diff
A PDF comparison utility in Python.
☆522 · Feb 8, 2026 · Updated 5 months ago
mattandahalfew / Levenshtein_search
Python search module for fast approximate string matching
☆54 · Jan 25, 2023 · Updated 3 years ago