huridocs / pdf-text-extraction

This project extracts text from PDF files using the output of the pdf-document-layout-analysis service. It relies on that service's segmentation and classification of page content to automate the extraction.
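As a rough illustration of the idea described above, the sketch below filters classified segments by type and joins their text. It is not this project's actual code: the function name `extract_text`, the `type` and `text` keys, and the example type labels are all assumptions made for illustration, and the segments are assumed to arrive already in reading order.

```python
from typing import Iterable, Optional, Set


def extract_text(segments: Iterable[dict], keep_types: Optional[Set[str]] = None) -> str:
    """Join the text of layout segments, optionally keeping only some types.

    Each segment is assumed (hypothetically) to be a dict with a "type"
    label and a "text" string, listed in reading order.
    """
    parts = []
    for segment in segments:
        # Skip segments whose type was not requested.
        if keep_types is not None and segment["type"] not in keep_types:
            continue
        text = segment["text"].strip()
        # Skip segments that contain only whitespace.
        if text:
            parts.append(text)
    return "\n".join(parts)


# Hypothetical layout-analysis output for a single page.
segments = [
    {"type": "Title", "text": "Report "},
    {"type": "Page header", "text": "p. 1"},
    {"type": "Text", "text": "Body."},
]

print(extract_text(segments, keep_types={"Title", "Text"}))
```

Filtering by segment type is what lets an extractor of this kind drop page headers, footers, or other non-body content before returning the text.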
27 · Updated last month

Alternatives and similar repositories for pdf-text-extraction:

Users who are interested in pdf-text-extraction are comparing it to the libraries listed below.