sirfz / tesserocr
A Python wrapper for the tesseract-ocr API
☆2,151, updated last month
Alternatives and similar repositories for tesserocr
Users who are interested in tesserocr are comparing it to the libraries listed below.
- A Python wrapper for Google Tesseract (☆6,310, updated 3 weeks ago)
- A python module that wraps the pdftoppm utility to convert PDF to PIL Image object (☆1,933, updated last year)
- A Python wrapper for Tesseract and Cuneiform -- Moved to Gnome's Gitlab (☆929, updated 7 years ago)
- Python-based tools for document analysis and OCR (☆3,470, updated 4 years ago)
- Best (most accurate) trained LSTM models. (☆1,501, updated last year)
- OCR engine for all the languages (☆944, updated this week)
- Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community. (☆1,641, updated 9 months ago)
- Tesseract documentation (☆2,292, updated last month)
- Line based ATR Engine based on OCRopy (☆1,185, updated 9 months ago)
- Links to awesome OCR projects (☆3,085, updated last year)
- extract text from any document. no muss. no fuss. (☆4,434, updated last week)
- Train Tesseract LSTM with make (☆713, updated 9 months ago)
- Source training data for Tesseract for lots of languages (☆863, updated 10 months ago)
- A packaged and flexible version of the CRAFT text detector and Keras CRNN recognition model. (☆1,472, updated 4 months ago)
- A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents. (☆2,252, updated 3 years ago)
- 🪼 a python library for doing approximate and phonetic matching of strings. (☆2,189, updated last month)
- Text page dewarping using a "cubic sheet" model (☆1,502, updated 2 years ago)
- A simple python OCR engine using opencv (☆531, updated 2 years ago)
- Wkhtmltopdf python wrapper to convert html to pdf (☆2,038, updated 2 years ago)
- Read one-dimensional barcodes and QR codes from Python 2 and 3. (☆814, updated 2 years ago)
- Python PDF Parser (Not actively maintained). Check out pdfminer.six. (☆5,303, updated 3 years ago)
- TableBank: A Benchmark Dataset for Table Detection and Recognition (☆1,082, updated last year)
- Various documents related to Tesseract OCR (☆267, updated 4 years ago)
- Fast integer versions of trained LSTM models (☆594, updated last year)
- The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity (☆1,278, updated 4 years ago)
- Trained models with fast variant of the "best" LSTM models + legacy models (☆7,381, updated last year)
- A synthetic data generator for text recognition (☆3,636, updated last year)
- Send email in Python conveniently for gmail using yagmail (☆2,724, updated 3 years ago)
- Community maintained fork of pdfminer - we fathom PDF (☆6,889, updated last week)
- A Python library for reading and writing PDF, powered by QPDF (☆2,633, updated last week)