tesseract-ocr / langdata
Source training data for Tesseract for lots of languages
☆858 Updated 6 months ago
Alternatives and similar repositories for langdata
Users who are interested in langdata compare it to the repositories listed below
- Various documents related to Tesseract OCR ☆265 Updated 4 years ago
- Best (most accurate) trained LSTM models. ☆1,426 Updated last year
- Fast integer versions of trained LSTM models ☆567 Updated last year
- Trained models with fast variant of the "best" LSTM models + legacy models ☆7,173 Updated last year
- Train Tesseract LSTM with make ☆698 Updated 5 months ago
- Line based ATR Engine based on OCRopy ☆1,165 Updated 4 months ago
- Data used for LSTM model training ☆122 Updated last year
- Python-based tools for document analysis and OCR ☆3,464 Updated 4 years ago
- Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML. ☆397 Updated last year
- A curated list of promising OCR resources ☆1,693 Updated 3 years ago
- OCR engine for all the languages ☆894 Updated this week
- finetuned traineddata files for tesseract 4.0.0 for testing ☆169 Updated 6 years ago
- A simple python OCR engine using opencv ☆529 Updated last year
- Tesseract 4 OCR Compilation - Docker Container ☆55 Updated 3 years ago
- Files and Scripts to run Tesseract 5 LSTM Training using fonts ☆79 Updated 3 years ago
- Real-time image preprocess and OCR. ☆274 Updated 3 years ago
- ☆146 Updated 5 years ago
- Links to awesome OCR projects ☆3,049 Updated last year
- Tesseract Open Source OCR Engine (main repository) ☆3,803 Updated 3 months ago
- Java GUI frontend for Tesseract OCR engine ☆69 Updated 3 months ago
- An Optical Character Recognition Framework in Java ☆31 Updated 11 years ago
- A small framework taking over the manual training process described in the Tesseract3 Wiki: https://code.google.com/p/tesseract-ocr/wiki/… ☆132 Updated 2 years ago
- Detect and fix skew in images containing text ☆267 Updated 6 years ago
- Tesseract documentation ☆2,182 Updated 3 weeks ago
- Pre-Recognize Library - library with algorithms for improving OCR quality. ☆109 Updated 2 years ago
- Pretrained mixed models to be used with Calamari. ☆65 Updated last year
- Extract tables from scanned image PDFs using Optical Character Recognition. ☆276 Updated 5 years ago
- The module extracts text from image using the tesseract-OCR engine. Generally, text present in the images are blur or are of uneven sizes… ☆149 Updated 6 years ago
- A Tensorflow model for text recognition (CNN + seq2seq with visual attention) available as a Python package and compatible with Google Cl… ☆1,082 Updated last year
- A semi-automatic open-source tool for Layout Analysis and Region EXtraction on early printed books. ☆192 Updated 3 months ago