madmaze / pytesseract
A Python wrapper for Google Tesseract
☆5,841 Updated last week
Related projects
Alternatives and complementary repositories for pytesseract
- A Python wrapper for the tesseract-ocr API ☆2,014 Updated 2 months ago
- Community maintained fork of pdfminer - we fathom PDF ☆5,946 Updated 3 months ago
- A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files ☆8,302 Updated this week
- Trained models with fast variant of the "best" LSTM models + legacy models ☆6,402 Updated 8 months ago
- Python PDF Parser (Not actively maintained). Check out pdfminer.six. ☆5,252 Updated last year
- A python module that wraps the pdftoppm utility to convert PDF to PIL Image object ☆1,631 Updated 3 months ago
- The lxml XML toolkit for Python ☆2,699 Updated this week
- A Python wrapper for Tesseract and Cuneiform -- Moved to Gnome's Gitlab ☆930 Updated 6 years ago
- Python-based tools for document analysis and OCR ☆3,420 Updated 3 years ago
- Best (most accurate) trained LSTM models. ☆1,240 Updated 8 months ago
- Tesseract Open Source OCR Engine (main repository) ☆62,200 Updated this week
- PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents. ☆5,498 Updated this week
- Create and modify Word documents with Python ☆4,603 Updated 2 months ago
- Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and … ☆24,416 Updated last month
- Headless chrome/chromium automation library (unofficial port of puppeteer) ☆3,678 Updated 4 months ago
- A Python library to extract tabular data from PDFs ☆3,008 Updated 2 months ago
- A python wrapper for libmagic ☆2,636 Updated 3 months ago
- Python character encoding detector ☆2,177 Updated 3 months ago
- MySQL client library for Python ☆7,674 Updated this week
- Python Imaging Library (Fork) ☆12,265 Updated this week
- Source training data for Tesseract for lots of languages ☆837 Updated 8 months ago
- Tesseract Open Source OCR Engine (main repository) ☆3,140 Updated this week
- A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents. ☆2,219 Updated 2 years ago
- Lightweight, scriptable browser as a service with an HTTP API ☆4,097 Updated 3 months ago
- Leptonica is an open source library containing software that is broadly useful for image processing and image analysis applications. The … ☆1,795 Updated this week
- Links to awesome OCR projects ☆2,803 Updated 4 months ago
- A Python implementation of John Gruber’s Markdown with Extension support. ☆3,791 Updated last month
- MySQL/MariaDB connector for Python ☆2,447 Updated 2 weeks ago
- Wkhtmltopdf python wrapper to convert html to pdf ☆1,990 Updated last year
- Requests + Gevent = <3 ☆4,485 Updated 3 months ago