ocrmypdf / OCRmyPDF-EasyOCR
OCRmyPDF EasyOCR plugin
☆53 Updated 2 months ago
Related projects
Alternatives and complementary repositories for OCRmyPDF-EasyOCR
- Web interface for recognizing text, proofreading OCR, and creating fully-digitized documents. ☆110 Updated this week
- Run OCR, extract information from documents and classify them. In addition, annotate documents and build custom NLP and computer vision m… ☆62 Updated last week
- Ergonomic line-by-line transcription of scanned text. ☆48 Updated 3 years ago
- Master repository which includes most other OCR-D repositories as submodules ☆72 Updated last month
- Document image dewarping library using a cubic sheet model ☆117 Updated this week
- Building scantailor and its dependencies ☆55 Updated last year
- Fast PDF generation and compression. Deals with millions of pages daily. ☆102 Updated 3 months ago
- Logical structure analysis for visually structured documents ☆84 Updated 2 years ago
- An OCR evaluation tool ☆64 Updated last month
- A post-processing tool for scanned sheets of paper. ☆73 Updated 8 months ago
- `pdfstructure` detects, splits and organizes the document's text content into its natural structure as envisioned by the author. ☆101 Updated 7 months ago
- Conversions between various OCR formats ☆71 Updated last year
- Provides OCR (Optical Character Recognition) services through web applications ☆239 Updated 9 months ago
- Recognize text using Calamari OCR and the OCR-D framework ☆13 Updated 3 weeks ago
- Convert a PDF via OCR to a TXT file in UTF-8 encoding ☆141 Updated last year
- This repository contains code for line detection, character detection, and recognition on 2D cuneiform images ☆30 Updated 5 years ago
- Analyze XML extracted from PDFs (e.g. from TET or PDFMiner) ☆20 Updated 6 years ago
- A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, PDF, with GRO… ☆44 Updated 3 months ago
- OCR-D python tools ☆33 Updated 3 months ago
- OCR evaluation brought to you by University of Alicante ☆67 Updated 2 years ago
- 🏭 PDF text extraction pipeline: self-hosted, local-first, Docker-based ☆299 Updated last year
- ☆15 Updated 3 years ago
- Integrate AI-powered Document Analysis Pipelines ☆62 Updated this week
- Detect and read handwritten words on scanned pages. ☆106 Updated last year
- Docker Image with latest Tesseract OCR Version 5.x.x built from sources ☆30 Updated last week
- Pure-python library for adding annotations to PDFs ☆198 Updated 3 years ago
- Document Image Binarization ☆73 Updated last month
- Python library to extract tabular data from images and scanned PDFs ☆264 Updated 3 months ago
- This library builds a graph-representation of the content of PDFs. The graph is then clustered, resulting page segments are classified an… ☆22 Updated 4 years ago
- Training scripts for Argos Translate ☆123 Updated last week