h / pytesseract
Python-tesseract is an optical character recognition (OCR) tool for Python
☆140 Updated 6 years ago
Alternatives and similar repositories for pytesseract
Users who are interested in pytesseract are comparing it to the libraries listed below.
- Detect and read handwritten words on scanned pages. ☆119 Updated last year
- OnnxTR a docTR (Document Text Recognition) library Onnx pipeline wrapper - for seamless, high-performing & accessible OCR ☆109 Updated this week
- A template repo holding our common setup for a python project ☆104 Updated 2 years ago
- Python bindings to PDFium ☆571 Updated this week
- A simple tool for automatic image annotation using Roboflow API ☆46 Updated last year
- Updating this repo every week, You may want to STAR it :) ☆67 Updated 9 months ago
- Ultralytics Notebooks 🚀 ☆78 Updated last week
- ☆112 Updated 5 months ago
- ☆82 Updated 6 months ago
- Simple package to extract text with coordinates from programmatic PDFs ☆122 Updated last month
- ☆14 Updated last year
- Passively collect images for computer vision datasets on the edge. ☆33 Updated last year
- OCR engine for all the languages ☆826 Updated this week
- Powerful handwritten text recognition. A simple-to-use, unofficial implementation of the paper "TrOCR: Transformer-based Optical Characte… ☆202 Updated 4 months ago
- A comprehensive tutorial for OCR in python using Tesseract-OCR and OpenCV ☆120 Updated 3 years ago
- Library used to deskew a scanned document ☆461 Updated 2 weeks ago
- This PyTorch implementation of the LayoutLM paper by Microsoft demonstrates the SequenceClassification task using HuggingFaceTransformers to cl… ☆34 Updated 2 years ago
- ☆153 Updated this week
- ☆16 Updated 4 years ago
- Image comparison slider component for Streamlit ☆245 Updated 10 months ago
- A Python client for the Unstructured Platform API ☆101 Updated this week
- Recognition of handwritten text using CRAFT text detection and TrOCR ☆26 Updated 2 years ago
- Apply different text recognition services to images of handwritten documents. ☆178 Updated 2 years ago
- Streamlit component for invoice document labeling ☆61 Updated 2 years ago
- Collection of PDF parsing libraries like AI based docling, claude, openai, llama-vision, unstructured-io, and pdfminer, pymupdf, pdfplumb… ☆72 Updated 3 weeks ago
- BoxDetect is a Python package based on OpenCV which allows you to easily detect rectangular shapes like character or checkbox boxes on sc… ☆109 Updated 2 years ago
- ☆40 Updated last month
- Benchmarking PDF libraries ☆276 Updated last year
- A Python asyncio wrapper for Tesseract-OCR. ☆26 Updated 6 months ago
- Data extraction with LLM on CPU ☆266 Updated last year