sirfz / tesserocr
A Python wrapper for the tesseract-ocr API
☆2,092 Updated this week
Alternatives and similar repositories for tesserocr:
Users who are interested in tesserocr are comparing it to the libraries listed below.
- A Python wrapper for Google Tesseract ☆6,089 Updated last month
- Python-based tools for document analysis and OCR ☆3,449 Updated 3 years ago
- A Python module that wraps the pdftoppm utility to convert PDF to PIL Image object ☆1,770 Updated 9 months ago
- Best (most accurate) trained LSTM models. ☆1,334 Updated last year
- A packaged and flexible version of the CRAFT text detector and Keras CRNN recognition model. ☆1,450 Updated 9 months ago
- A Python wrapper for Tesseract and Cuneiform -- Moved to Gnome's Gitlab ☆928 Updated 6 years ago
- Line based ATR Engine based on OCRopy ☆1,134 Updated 3 weeks ago
- Source training data for Tesseract for lots of languages ☆857 Updated last month
- Train Tesseract LSTM with make ☆673 Updated 3 weeks ago
- extract text from any document. no muss. no fuss. ☆4,112 Updated 5 months ago
- A synthetic data generator for text recognition ☆3,467 Updated 9 months ago
- Tesseract documentation ☆2,025 Updated 3 months ago
- Python PDF Parser (Not actively maintained). Check out pdfminer.six. ☆5,290 Updated 2 years ago
- OCR engine for all the languages ☆822 Updated last week
- docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning. ☆4,622 Updated last week
- Community maintained fork of pdfminer - we fathom PDF ☆6,431 Updated last week
- A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents. ☆2,239 Updated 2 years ago
- Read one-dimensional barcodes and QR codes from Python 2 and 3. ☆764 Updated last year
- The ctypes-based simple ImageMagick binding for Python ☆1,443 Updated last month
- A small C++ implementation of LSTM networks, focused on OCR. ☆825 Updated 5 years ago
- Visual Attention based OCR ☆1,117 Updated 6 years ago
- A curated list of promising OCR resources ☆1,681 Updated 2 years ago
- Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML. ☆391 Updated 8 months ago
- Official implementation of Character Region Awareness for Text Detection (CRAFT) ☆3,239 Updated 9 months ago
- A Python library to extract tabular data from PDFs ☆3,284 Updated this week
- Simple wrapper of tabula-java: extract table from PDF into pandas DataFrame ☆2,248 Updated 5 months ago
- A Python Perceptual Image Hashing Module ☆3,602 Updated 3 weeks ago
- fast python port of arc90's readability tool, updated to match latest readability.js! ☆2,768 Updated this week
- python parser for human readable dates ☆2,652 Updated last month
- Fast integer versions of trained LSTM models ☆534 Updated 9 months ago