h / pytesseract
Python-tesseract is an optical character recognition (OCR) tool for Python.
☆175 Updated last week
Alternatives and similar repositories for pytesseract
Users who are interested in pytesseract are comparing it to the libraries listed below.
- Python bindings to PDFium, reasonably cross-platform. ☆659 Updated this week
- Official Python SDK for Deepgram. ☆359 Updated this week
- Demos, examples and utilities using PyMuPDF ☆686 Updated last year
- The official Python Library for the Groq API ☆552 Updated last week
- A template repo holding our common setup for a python project ☆122 Updated 3 years ago
- img2table is a table identification and extraction Python Library for PDF and images, based on OpenCV image processing ☆803 Updated 2 months ago
- ☆386 Updated last year
- Python binding to Poppler-cpp pdf library ☆113 Updated last year
- A Python client for the Unstructured Platform API ☆107 Updated this week
- ☆144 Updated this week
- A curated list of resources around PDF files ☆144 Updated last year
- Benchmarking PDF libraries ☆314 Updated 4 months ago
- Aspose.Words for Python via .NET examples and showcases ☆127 Updated 3 weeks ago
- The official Roboflow Python package. Manage your datasets, models, and deployments. Roboflow has everything you need to build a computer… ☆489 Updated last week
- OCR library to extract text & tables from PDF files and images. Convert any image or PDF to CSV / TXT / JSON / Searchable PDF. ☆120 Updated 2 years ago
- Powerful handwritten text recognition. A simple-to-use, unofficial implementation of the paper "TrOCR: Transformer-based Optical Characte… ☆224 Updated 9 months ago
- Updating this repo every week, You may want to STAR it :) ☆70 Updated last year
- A python module that wraps the pdftoppm utility to convert PDF to PIL Image object ☆1,887 Updated last year
- pgvector support for Python ☆1,354 Updated 3 weeks ago
- Tesseract documentation ☆2,206 Updated last month
- git mirror for Beautiful Soup 4.3.2 ☆185 Updated 2 years ago
- Python Library for Accessing the Cohere API ☆369 Updated last week
- A Python asyncio wrapper for Tesseract-OCR. ☆26 Updated last month
- OCR engine for all the languages ☆903 Updated this week
- 📚 Process PDFs, Word documents and more with spaCy ☆784 Updated 7 months ago
- ☆67 Updated 2 years ago
- OnnxTR a docTR (Document Text Recognition) library Onnx pipeline wrapper - for seamless, high-performing & accessible OCR ☆159 Updated this week
- OCRmyPDF EasyOCR plugin ☆93 Updated last month
- Remove background from any image ☆216 Updated 7 months ago
- Source code for the Streamlit Python library documentation ☆160 Updated last week