Franky1 / Tesseract-OCR-5-Docker
Docker Image with latest Tesseract OCR Version 5.x.x built from sources
☆45 · Updated 3 weeks ago
Alternatives and similar repositories for Tesseract-OCR-5-Docker
Users who are interested in Tesseract-OCR-5-Docker are comparing it to the repositories listed below.
- OCRmyPDF EasyOCR plugin ☆93 · Updated last month
- mrkdwn_analysis is a Python library for analyzing Markdown files. It extracts and categorizes Markdown elements like headers, sections, l… ☆44 · Updated last week
- Detect and read handwritten words on scanned pages. ☆134 · Updated 2 years ago
- OnnxTR a docTR (Document Text Recognition) library Onnx pipeline wrapper - for seamless, high-performing & accessible OCR ☆158 · Updated last week
- Library used to deskew a scanned document ☆491 · Updated this week
- Document image dewarping library using a cubic sheet model ☆178 · Updated last week
- Demos, examples and utilities using PyMuPDF ☆686 · Updated last year
- Complex data extraction and orchestration framework designed for processing unstructured documents. It integrates AI-powered document pip… ☆74 · Updated this week
- ICIP 2022: Adaptive Radial Projection on Fourier Magnitude Spectrum for Document Image Skew Estimation ☆148 · Updated 5 months ago
- Python bindings to PDFium, reasonably cross-platform. ☆661 · Updated this week
- Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML. ☆402 · Updated last year
- Sentence Transformers API: An OpenAI compatible embedding API server ☆68 · Updated last year
- OCR using Python, Tesseract and OpenCV in a Docker container ☆125 · Updated 2 years ago
- Python bindings to connect to a LibreTranslate API ☆116 · Updated 8 months ago
- Powerful handwritten text recognition. A simple-to-use, unofficial implementation of the paper "TrOCR: Transformer-based Optical Characte… ☆224 · Updated 10 months ago
- Toolkit for training/converting LibreTranslate compatible language models 🚂 ☆65 · Updated 4 months ago
- Object Detection Model for Scanned Documents ☆94 · Updated 8 months ago
- Python API for PDF documents ☆124 · Updated last year
- Convert a PDF via OCR to a TXT file in UTF-8 encoding ☆152 · Updated 2 years ago
- A Python asyncio wrapper for Tesseract-OCR. ☆26 · Updated last month
- A performant high-throughput CPU-based API for Meta's No Language Left Behind (NLLB) using CTranslate2, hosted on Hugging Face Spaces. ☆127 · Updated this week
- Split and analyze text files using langchain and streamlit ☆50 · Updated last year
- whisper-cpp-serve Real-time speech recognition and c+ of OpenAI's Whisper model in C/C++ ☆71 · Updated last year
- Python binding to Poppler-cpp pdf library ☆113 · Updated last year
- ☆67 · Updated 2 years ago
- A free tool to OCR a PDF and add a text "layer" in the original file, making a searchable PDF. Use only open source tools. Please tip! ☆298 · Updated 5 months ago
- ☆162 · Updated last week
- Image pre-processing and OCR techniques with OpenCV and PyTesseract ☆24 · Updated 3 years ago
- Deidentify people's names and gender specific pronouns ☆43 · Updated 6 months ago
- Record audio and save a transcription to your system's clipboard with ctranslate2 and faster-whisper. ☆155 · Updated last month