tesseract4java / tesseract4java
Java GUI and Tools for Tesseract OCR
☆329 Updated last year
Alternatives and similar repositories for tesseract4java
Users who are interested in tesseract4java are comparing it to the libraries listed below.
- Java JNA wrapper for Tesseract OCR API ☆1,674 Updated 2 weeks ago
- pdfHTML is an iText add-on for Java that allows you to easily convert HTML and CSS into standards compliant PDFs that are accessible, sea… ☆242 Updated last week
- Pdf2Dom is a PDF parser that converts the documents to a HTML DOM representation. The obtained DOM tree may be then serialized to a HTM… ☆185 Updated 2 years ago
- Converts XHTML to OpenXML WordML (docx) using docx4j ☆145 Updated 2 weeks ago
- Java OCR allows you to perform OCR and bar code recognition on images (JPEG, PNG, TIFF, PDF, etc.) and output as plain text, xml with ful… ☆135 Updated 10 years ago
- Fast integer versions of trained LSTM models ☆549 Updated 10 months ago
- Aspose.PDF for Java examples, plugins and showcases ☆134 Updated 4 months ago
- pdfOCR is an iText 7 add-on to recognize and extract text in scanned documents and images. It can also convert them into fully ISO-compli… ☆36 Updated last month
- A Java library to convert .pdf files into .epub, .txt, .png, .jpg, .zip formats. ☆210 Updated 8 months ago
- Java utility for parsing PDF tabular data using Apache PDFBox and OpenCV ☆72 Updated 2 years ago
- Source training data for Tesseract for lots of languages ☆857 Updated 2 months ago
- Java library for rendering PDF documents to the screen using Java2D ☆191 Updated 2 years ago
- Convert Word documents to simple and clean HTML ☆267 Updated 2 weeks ago
- This will demonstrate extracting text from scanned documents (pdf, jpg, tiff, bmp, png etc) ☆30 Updated 8 years ago
- JODConverter automates document conversions using LibreOffice/OpenOffice.org ☆463 Updated 2 years ago
- Plain Java unrar library ☆297 Updated last week
- JPEG2000 support for Java Advanced Imaging Image I/O Tools API ☆78 Updated last year
- Various documents related to Tesseract OCR ☆266 Updated 3 years ago
- java decaptcha ☆142 Updated 4 years ago
- An HTML to PDF conversion library written in Java, based on wkhtmltopdf. ☆182 Updated 6 years ago
- Aspose.OCR for Java Examples and Sample Projects ☆43 Updated last year
- documents4j is a Java library for converting documents into another document format ☆577 Updated 4 months ago
- The Open Source RTF (Rich Text Format) Java Library ☆45 Updated last year
- Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML. ☆394 Updated 10 months ago
- edit a docx using CKEditor via XHTML round trip (with some session state) ☆47 Updated 7 years ago
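- Java OCR recognition component (based on the Tesseract OCR engine). It automatically handles image cleanup and recognition of CAPTCHA image content in one integrated workflow. Java Image cleanup, OCR recognition component (based Tesseract OCR e… ☆619 Updated 3 years ago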
- JAI ImageIO Core (without javax.media.jai dependencies) ☆243 Updated last year
- Web Browser, Flash Player, HTML editor, Media player for Swing ☆198 Updated 2 years ago
- An Optical Character Recognition Framework in Java ☆31 Updated 11 years ago
- Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader) ☆189 Updated last month