madmaze / pytesseract
A Python wrapper for Google Tesseract
☆6,055 · Updated last month
Alternatives and similar repositories for pytesseract:
Users who are interested in pytesseract are comparing it to the libraries listed below.
- A Python wrapper for the tesseract-ocr API ☆2,077 · Updated last month
- Trained models with fast variant of the "best" LSTM models + legacy models ☆6,785 · Updated last year
- Tesseract Open Source OCR Engine (main repository) ☆65,569 · Updated last month
- Python-based tools for document analysis and OCR ☆3,444 · Updated 3 years ago
- Community maintained fork of pdfminer - we fathom PDF ☆6,301 · Updated 7 months ago
- A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files ☆8,860 · Updated this week
- Python PDF Parser (Not actively maintained). Check out pdfminer.six. ☆5,283 · Updated 2 years ago
- A Python wrapper for Tesseract and Cuneiform -- Moved to Gnome's Gitlab ☆929 · Updated 6 years ago
- pdfrw is a pure Python library that reads and writes PDFs ☆1,884 · Updated 10 months ago
- Links to awesome OCR projects ☆2,930 · Updated 8 months ago
- Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables. ☆7,439 · Updated last month
- Best (most accurate) trained LSTM models. ☆1,316 · Updated last year
- A python module that wraps the pdftoppm utility to convert PDF to PIL Image object ☆1,740 · Updated 8 months ago
- Leptonica is an open source library containing software that is broadly useful for image processing and image analysis applications. The … ☆1,877 · Updated last week
- Tesseract documentation ☆1,983 · Updated last month
- Tesseract Open Source OCR Engine (main repository) ☆3,411 · Updated 4 months ago
- Create and modify Word documents with Python ☆4,879 · Updated 7 months ago
- extract text from any document. no muss. no fuss. ☆4,013 · Updated 3 months ago
- Source training data for Tesseract for lots of languages ☆850 · Updated last year
- PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents. ☆6,783 · Updated this week
- Python Imaging Library (Fork) ☆12,648 · Updated this week
- Python bindings for FFmpeg - with complex filtering support ☆10,377 · Updated 7 months ago
- Useful extensions to the standard Python datetime features ☆2,433 · Updated last month
- Python composable command line interface toolkit ☆16,167 · Updated last week
- The lxml XML toolkit for Python ☆2,781 · Updated this week
- Simple cross-platform colored terminal text in Python ☆3,638 · Updated last week
- 🏹 Better dates & times for Python ☆8,822 · Updated 4 months ago
- A Fast, Extensible Progress Bar for Python and CLI ☆29,500 · Updated last month
- Simple wrapper of tabula-java: extract table from PDF into pandas DataFrame ☆2,241 · Updated 3 months ago
- Video editing with Python ☆13,181 · Updated last month