madmaze / pytesseract
A Python wrapper for Google Tesseract
☆5,868 · Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for pytesseract
- A Python wrapper for the tesseract-ocr API · ☆2,016 · Updated 2 months ago
- Community maintained fork of pdfminer - we fathom PDF · ☆5,961 · Updated 3 months ago
- The lxml XML toolkit for Python · ☆2,706 · Updated 2 weeks ago
- Tesseract Open Source OCR Engine (main repository) · ☆62,429 · Updated last week
- extract text from any document. no muss. no fuss. · ☆3,910 · Updated this week
- Simple wrapper of tabula-java: extract table from PDF into pandas DataFrame · ☆2,193 · Updated last month
- A python module that wraps the pdftoppm utility to convert PDF to PIL Image object · ☆1,642 · Updated 3 months ago
- A Python library for automating interaction with websites. · ☆4,673 · Updated this week
- A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files · ☆8,388 · Updated this week
- Python PDF Parser (Not actively maintained). Check out pdfminer.six. · ☆5,256 · Updated last year
- Trained models with fast variant of the "best" LSTM models + legacy models · ☆6,439 · Updated 8 months ago
- Best (most accurate) trained LSTM models. · ☆1,249 · Updated 8 months ago
- Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and … · ☆24,553 · Updated last month
- Python job scheduling for humans. · ☆11,846 · Updated 5 months ago
- A Python wrapper for Tesseract and Cuneiform -- Moved to Gnome's Gitlab · ☆930 · Updated 6 years ago
- Headless chrome/chromium automation library (unofficial port of puppeteer) · ☆3,690 · Updated 4 months ago
- Create and modify Word documents with Python · ☆4,641 · Updated 3 months ago
- Monitor Memory usage of Python code · ☆4,381 · Updated 6 months ago
- PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents. · ☆5,677 · Updated this week
- Python library and shell utilities to monitor filesystem events. · ☆6,629 · Updated this week
- Tesseract documentation · ☆1,837 · Updated this week
- Simple job queues for Python · ☆9,895 · Updated this week
- pipreqs - Generate pip requirements.txt file based on imports of any project. Looking for maintainers to move this project forward. · ☆6,784 · Updated 4 months ago
- Leptonica is an open source library containing software that is broadly useful for image processing and image analysis applications. The … · ☆1,804 · Updated this week
- Python-based tools for document analysis and OCR · ☆3,422 · Updated 3 years ago
- Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community. · ☆1,510 · Updated 7 months ago
- Task scheduling library for Python · ☆6,295 · Updated this week
- A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents. · ☆2,220 · Updated 2 years ago
- Lightweight, scriptable browser as a service with an HTTP API · ☆4,099 · Updated 3 months ago
- Scrapy+Splash for JavaScript integration · ☆3,157 · Updated last year