huridocs/pdf-text-extraction

This project extracts text from PDF files using the output of the pdf-document-layout-analysis service. It relies on that service's segmentation and classification of page content to automate text extraction.

☆39

Alternatives and similar repositories for pdf-text-extraction

Users who are interested in pdf-text-extraction are comparing it to the libraries listed below.

irthomasthomas / llm-plugin-generator
LLM plugin to generate plugins for LLM
☆13 · Dec 30, 2024 · Updated last year
SymbolixAU / mapbox
☆11 · Sep 19, 2019 · Updated 6 years ago
ropensci / BaseSet
Provides classes for working with sets
☆11 · Dec 19, 2025 · Updated 6 months ago
elder-plinius / Gitty
☆25 · Feb 15, 2024 · Updated 2 years ago
xlr8harder / llm-compliance
☆23 · Jun 20, 2026 · Updated last week
DS4PS / ds4ps.github.io
Umbrella website for Data Science for the Public Sector
☆12 · Aug 2, 2024 · Updated last year
shiftcmdk / BookmarkIcons
Set custom icons for Safari bookmarks on iOS
☆11 · Feb 22, 2020 · Updated 6 years ago
minimaxir / imdb-data-analysis
R Code + R Notebook on how to process and visualize the official IMDb datasets.
☆12 · Jul 16, 2018 · Updated 7 years ago
inket / AutomatedBrowser
A Swift package for interacting with selenium and undetected-chromedriver through python by using PythonKit.
☆13 · Jun 21, 2025 · Updated last year
davidsjoberg / simplecountries
Convert alternative country names to simple country names
☆11 · Jun 3, 2020 · Updated 6 years ago
thomasyung / GDocs-File-Previewer
This Elgg plugin lets users preview MS Office files (doc, docx, xls, xlsx, ppt, pptx), Apple iWork pages, Adobe eps, and zip files using …
☆12 · Aug 28, 2015 · Updated 10 years ago
Infineon / PSoC4-MCU-Pioneer-Kits
This repository contains getting started projects related to all PSoC4 pioneer kits.
☆14 · Oct 30, 2018 · Updated 7 years ago
globalgov / messydates
R package for Extended Date/Time Format (EDTF)
☆16 · Jun 2, 2025 · Updated last year
weaviate / elysia-frontend
Frontend Repository for Elysia
☆180 · Feb 6, 2026 · Updated 4 months ago
freddyaboulton / gradio-agentchatbot
Chat with agents 🤖 and see their thoughts 💭
☆15 · Jul 8, 2024 · Updated last year
mattcox / Tree
A hierarchical tree structure for Swift
☆18 · Dec 28, 2024 · Updated last year
andreweatherman / gtUtils
Enhancements and Utilities for the gt Package
☆20 · Sep 15, 2024 · Updated last year
King-Bong-Software / postwoman
PostWoman 💅 is a lightweight Postman alternative designed specifically for macOS
☆32 · Jan 16, 2026 · Updated 5 months ago
api2r / nectar
A Framework for Web API Packages
☆16 · Jun 7, 2026 · Updated 3 weeks ago
yantoz / FinderEx
MacOS Finder Sync Extension to Allow Adding Custom Actions
☆14 · Feb 15, 2022 · Updated 4 years ago
crazycapivara / maplibre-gl-r
An R Interface to maplibre-gl-js
☆13 · Nov 24, 2022 · Updated 3 years ago
richard-gyiko / json-schema-to-pydantic
☆44 · Mar 9, 2026 · Updated 3 months ago
ngandhi369 / AI-Email-Classifier
Flask web app made using machine learning model. It uses mails from authorized user's Gmail and shows mails with categorical label on web…
☆13 · May 23, 2023 · Updated 3 years ago
gtokman / cigarette
A simple Safari extension for macOS & iOS that removes ads from X, Reddit, and LinkedIn.
☆12 · Apr 13, 2025 · Updated last year
inbo / checklist
An R package for checking R packages and R code
☆21 · Jun 24, 2026 · Updated last week
elipousson / sfext
✂️🌐 An R package with extra options for simple features and spatial data
☆20 · Apr 30, 2026 · Updated 2 months ago
DawnLiExp / Me2Comic
A macOS GUI tool that calls GraphicsMagick to batch-edit images.
☆12 · Apr 28, 2026 · Updated 2 months ago
Marvis-Labs / marvis-tts-swift
A Swift version of Marvis TTS, running locally on Apple Silicon using MLX Swift.
☆23 · Jan 4, 2026 · Updated 5 months ago
sachink1729 / DSPy-Chain-of-Thought-RAG
Building a Chain of Thought RAG Model with DSPy, Qdrant and Ollama
☆36 · Mar 22, 2024 · Updated 2 years ago
jordibruin / WhisperKit
Robust speech recognition on-device with CoreML and Swift for iOS and macOS applications.
☆12 · Feb 21, 2024 · Updated 2 years ago
Toowiredd / claude-skills-automation
Fully automated memory and context management for Claude Code using hooks - Zero friction, zero context loss
☆31 · Oct 22, 2025 · Updated 8 months ago
Boot-Error / highlight_butler
Highlight Butler manages highlights in my highlight library
☆11 · Jul 6, 2021 · Updated 4 years ago
Fusyong / ai-proofread-vscode-extension
A VS Code extension for document and book proofreading based on LLM services
☆22 · Jun 21, 2026 · Updated last week
justind000 / nRF-IoT
RF24 based sensor-mesh (flood, addressless) network
☆26 · Jan 16, 2014 · Updated 12 years ago
Vision-CAIR / dochaystacks
Document Haystacks: Vision-Language Reasoning Over Piles of 1000+ Documents, CVPR 2025
☆26 · Jan 25, 2025 · Updated last year
nuance-dev / impulso
Yet another task manager app
☆21 · Nov 28, 2024 · Updated last year
unlp-workshop / unlp-2025-shared-task
UNLP 2025 Shared Task on Detecting Social Media Manipulation
☆23 · Aug 4, 2025 · Updated 10 months ago
Sombre-Osmoze / LibraryGenesis
A multi App to download files from LibGen.io
☆12 · Aug 5, 2019 · Updated 6 years ago
benbalter / gmail-and-google-calendar-stats
Scrapes your GMail and Google Calendar data and returns it as a CSV for further analysis.
☆20 · Jun 12, 2023 · Updated 3 years ago