ocrmypdf / OCRmyPDF-EasyOCR
OCRmyPDF EasyOCR plugin
☆86Updated 2 months ago
Alternatives and similar repositories for OCRmyPDF-EasyOCR
Users who are interested in OCRmyPDF-EasyOCR are comparing it to the libraries listed below.
- Building scantailor and its dependencies☆58Updated last year
- A post-processing tool for scanned sheets of paper.☆82Updated last year
- A curated list of resources around PDF files☆134Updated 10 months ago
- A wrapper for tesseract / abbyyOCR11 ocr4linux finereader cli that can perform batch operations or monitor a directory and launch an OCR …☆65Updated last year
- A free tool to OCR a PDF and add a text "layer" in the original file, making a searchable PDF. Use only open source tools. Please tip!☆290Updated last month
- User contributed (non Google) OCR models for Tesseract☆26Updated 2 months ago
- Document image dewarping library using a cubic sheet model☆160Updated this week
- Docker Image with latest Tesseract OCR Version 5.x.x built from sources☆39Updated 3 weeks ago
- ScanTailor Universal - a fork based on Enhanced+Featured+Master versions of ST☆214Updated 3 months ago
- Logical structure analysis for visually structured documents☆90Updated 2 years ago
- Python library to extract tabular data from images and scanned PDFs☆278Updated 10 months ago
- ☆11Updated last year
- This master's thesis project is based on OpenAI Whisper, with the goal of transcribing interviews☆47Updated 10 months ago
- Web interface for recognizing text, proofreading OCR, and creating fully-digitized documents.☆190Updated this week
- ScanTailor Advanced is the version that merges the features of the ScanTailor Featured and ScanTailor Enhanced versions, brings new ones …☆243Updated last week
- Google Colab Demo of CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents☆47Updated 3 years ago
- Complex data extraction and orchestration framework designed for processing unstructured documents. It integrates AI-powered document pip…☆70Updated last week
- A semi-automatic open-source tool for Layout Analysis and Region EXtraction on early printed books.☆187Updated last month
- Document Layout Analysis☆376Updated 2 weeks ago
- Object Detection Model for Scanned Documents☆93Updated 3 months ago
- PDF Table Extractor is an innovative Python project designed to tackle the challenge of extracting tables from scanned PDF documents. Lev…☆34Updated last year
- Experiment and integrate with different OCR frameworks seamlessly☆103Updated last year
- Python bindings to PDFium☆586Updated last week
- Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.☆28Updated 2 years ago
- Parallel and LAzY Analyzer for PDFs 🏖️☆31Updated last week
- A minimal Streamlit app showing how to launch and stop a FastAPI process on demand☆33Updated 2 years ago
- Data extraction with Donut ML model☆57Updated 10 months ago
- OnnxTR a docTR (Document Text Recognition) library Onnx pipeline wrapper - for seamless, high-performing & accessible OCR☆127Updated this week
- Tools to process books in a cloud based pipeline system☆61Updated 2 months ago
- Pre-Recognize Library - library with algorithms for improving OCR quality.☆106Updated 2 years ago