EisenVault / install-tesseract-redhat-centos
Script for downloading and installing Tesseract OCR Engine on RedHat and CentOS
☆53 · Updated 7 years ago
Alternatives and similar repositories for install-tesseract-redhat-centos
Users who are interested in install-tesseract-redhat-centos are comparing it to the repositories listed below
- Implementation of Vision Based Page Segmentation algorithm in Java ☆103 · Updated 5 years ago
- Tesseract 4 OCR Compilation - Docker Container ☆55 · Updated 3 years ago
- Integration between Stanford NLP and Apache Stanbol ☆34 · Updated 9 years ago
- A fast and comprehensive Java library capable of performing automaton and non-automaton based Levenshtein distance determination and neig… ☆43 · Updated 12 years ago
- Content Based Image Retrieval Plugin for Elasticsearch. It allows users to index images and search for similar images. ☆409 · Updated 9 years ago
- This provides tools for the b-bit MinHash algorithm. ☆36 · Updated 4 months ago
- Automatically exported from code.google.com/p/chromium-compact-language-detector ☆161 · Updated 5 years ago
- Java text categorization system ☆57 · Updated 8 years ago
- Pre-Recognize Library - library with algorithms for improving OCR quality. ☆109 · Updated 2 years ago
- A Java library implementing practical nearest neighbour search algorithm for multidimensional vectors that operates in sublinear time. It… ☆201 · Updated 5 years ago
- (Java) A Method to Extract Tabular Content from PDF Files ☆335 · Updated 2 years ago
- Additional opennlp mapping type for elasticsearch in order to perform named entity recognition ☆136 · Updated 9 years ago
- Files and Scripts to run Tesseract 5 LSTM Training using fonts ☆79 · Updated 3 years ago
- This tool extracts word vectors from Lucene index. ☆135 · Updated 7 years ago
- A reference mechanism for including content from other documents during the Elasticsearch analysis field mapping phase ☆36 · Updated 6 years ago
- ☆16 · Updated 9 years ago
- Next generation OCR engine based on LSTMs. ☆52 · Updated 7 years ago
- Carrot2 plugin for ElasticSearch ☆291 · Updated 2 years ago
- Using latent Dirichlet allocation (LDA) in Apache Lucene ☆58 · Updated 12 years ago
- Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML. ☆396 · Updated last year
- A small framework taking over the manual training process described in the Tesseract3 Wiki: https://code.google.com/p/tesseract-ocr/wiki/… ☆132 · Updated 2 years ago
- OCR evaluation brought to you by University of Alicante ☆68 · Updated 3 years ago
- Repository collecting all the submodules for the new PyTorch-based OCR System. ☆142 · Updated 4 years ago
- A plugin for language detection in Elasticsearch using Nakatani Shuyo's language detector ☆252 · Updated 7 years ago
- This demo uses data from TheMovieDB (TMDB) to demonstrate using Ranklib learning to rank models with Elasticsearch. ☆37 · Updated 2 years ago
- An expandable and scalable OCR pipeline ☆87 · Updated 7 years ago
- A simple program to extract the text from an image before performing OCR ☆222 · Updated 5 years ago
- Detect and fix skew in images containing text ☆267 · Updated 6 years ago
- Using OpenCV to detect and correct skew in images of text documents. ☆19 · Updated 10 years ago
- Pdf2Dom is a PDF parser that converts the documents to an HTML DOM representation. The obtained DOM tree may then be serialized to a HTM… ☆190 · Updated 2 years ago