tesseract4java / tesseract4java
Java GUI and Tools for Tesseract OCR
☆328 Updated last year
Alternatives and similar repositories for tesseract4java:
Users who are interested in tesseract4java are comparing it to the libraries listed below.
- Box editor and trainer for Tesseract OCR ☆236 Updated 8 months ago
- Java GUI frontend for Tesseract OCR engine ☆65 Updated 3 weeks ago
- Java JNA wrapper for Tesseract OCR API ☆1,651 Updated 3 weeks ago
- Pdf2Dom is a PDF parser that converts the documents to a HTML DOM representation. The obtained DOM tree may be then serialized to a HTM… ☆181 Updated 2 years ago
- Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader) ☆187 Updated last month
- JAI ImageIO Core (without javax.media.jai dependencies) ☆236 Updated last year
- Java OCR allows you to perform OCR and bar code recognition on images (JPEG, PNG, TIFF, PDF, etc.) and output as plain text, xml with ful… ☆133 Updated 9 years ago
- Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML. ☆388 Updated 7 months ago
- Convert Word documents to simple and clean HTML ☆259 Updated 2 months ago
- The Open Source RTF (Rich Text Format) Java Library ☆44 Updated 9 months ago
- OCR evaluation brought to you by University of Alicante ☆67 Updated 2 years ago
- Best (most accurate) trained LSTM models. ☆1,311 Updated last year
- Web Browser, Flash Player, HTML editor, Media player for Swing ☆197 Updated 2 years ago
- The hOCR Embedded OCR Workflow and Output Format ☆74 Updated 7 months ago
- Automatically exported from code.google.com/p/java-html2image ☆138 Updated last year
- JPEG2000 support for Java Advanced Imaging Image I/O Tools API ☆76 Updated last year
- Java utility for parsing PDF tabular data using Apache PDFBox and OpenCV ☆72 Updated last year
- A Java library to convert .pdf files into .epub, .txt, .png, .jpg, .zip formats. ☆210 Updated 5 months ago
- Source training data for Tesseract for lots of languages ☆847 Updated last year
- This will demonstrate extracting text from scanned documents (pdf, jpg, tiff, bmp, png, etc.) ☆30 Updated 8 years ago
- Java library for rendering PDF documents to the screen using Java2D ☆190 Updated last year
- Marvin Image Processing Framework provides features for processing images and videos in real-time. ☆112 Updated 2 years ago
- Java JNA Wrapper for Leptonica Image Processing Library ☆30 Updated 3 weeks ago
- pdfOCR is an iText 7 add-on to recognize and extract text in scanned documents and images. It can also convert them into fully ISO-compli… ☆35 Updated last week
- pdfHTML is an iText add-on for Java that allows you to easily convert HTML and CSS into standards compliant PDFs that are accessible, sea… ☆240 Updated this week
- CSSBox is an (X)HTML/CSS rendering engine written in pure Java. Its primary purpose is to provide a complete information about the render… ☆243 Updated 3 months ago
- documents4j is a Java library for converting documents into another document format ☆569 Updated last month
- The image4j library allows you to read and write certain image formats in 100% pure Java. ☆80 Updated last year
- Java library for reading, writing, converting and manipulating images and metadata ☆205 Updated 6 months ago
- A semi-automatic open-source tool for Layout Analysis and Region EXtraction on early printed books. ☆185 Updated 3 months ago