h / pytesseract
Python-tesseract is an optical character recognition (OCR) tool for Python
☆179 Updated 2 months ago
Alternatives and similar repositories for pytesseract
Users who are interested in pytesseract are comparing it to the libraries listed below
- Python bindings to PDFium, reasonably cross-platform. ☆699 Updated last week
- Demos, examples and utilities using PyMuPDF ☆693 Updated last year
- The official Python Library for the Groq API ☆565 Updated this week
- A curated list of resources around PDF files ☆147 Updated last year
- Simple package to extract text with coordinates from programmatic PDFs ☆226 Updated last month
- Aspose.Words for Python via .NET examples and showcases ☆131 Updated 3 weeks ago
- img2table is a table identification and extraction Python Library for PDF and images, based on OpenCV image processing ☆839 Updated last month
- ☆389 Updated last year
- Extract structured text from PDFs quickly ☆641 Updated 6 months ago
- A Python client for the Unstructured Platform API ☆111 Updated this week
- Streamlit PDF viewer ☆191 Updated 2 weeks ago
- Official Python client library for LinkedIn APIs ☆240 Updated last year
- Recognition of handwritten text using CRAFT text detection and TrOCR ☆26 Updated 3 years ago
- Official Python SDK for Deepgram. ☆379 Updated last week
- A Python module that wraps the pdftoppm utility to convert PDF to PIL Image object ☆1,928 Updated last year
- ☆174 Updated last month
- Python Library for Accessing the Cohere API ☆376 Updated 2 weeks ago
- Library used to deskew a scanned document ☆495 Updated this week
- A Python library to define and validate data types in Docling. ☆217 Updated 2 weeks ago
- OnnxTR, a docTR (Document Text Recognition) library ONNX pipeline wrapper, for seamless, high-performing & accessible OCR ☆164 Updated 2 weeks ago
- A Python tool to help extract information from structured PDFs. ☆427 Updated 2 weeks ago
- Access and change cookies from your Streamlit script ☆78 Updated last year
- PyMuPDF4LLM ☆1,194 Updated 3 weeks ago
- Powerful handwritten text recognition. A simple-to-use, unofficial implementation of the paper "TrOCR: Transformer-based Optical Characte… ☆235 Updated 11 months ago
- ☆66 Updated 2 years ago
- Object Detection Model for Scanned Documents ☆93 Updated 9 months ago
- 📚 Process PDFs, Word documents and more with spaCy ☆832 Updated 9 months ago
- Detect and read handwritten words on scanned pages. ☆134 Updated 2 years ago
- Python library to extract tabular data from images and scanned PDFs ☆286 Updated last year
- ☆199 Updated this week