tesseract4java / tesseract4java
Java GUI and Tools for Tesseract OCR
☆324 · Updated 9 months ago
Related projects:
- Pdf2Dom is a PDF parser that converts the documents to a HTML DOM representation. The obtained DOM tree may be then serialized to a HTM… ☆178 · Updated last year
- Java JNA wrapper for Tesseract OCR API ☆1,586 · Updated last month
- An Optical Character Recognition Framework in Java ☆30 · Updated 10 years ago
- pdfHTML is an iText add-on for Java that allows you to easily convert HTML and CSS into standards compliant PDFs that are accessible, sea… ☆226 · Updated 3 weeks ago
- Aspose.OCR for Java Examples and Sample Projects ☆42 · Updated 7 months ago
- Java OCR allows you to perform OCR and bar code recognition on images (JPEG, PNG, TIFF, PDF, etc.) and output as plain text, xml with ful… ☆132 · Updated 9 years ago
- JODConverter automates document conversions using LibreOffice/OpenOffice.org ☆463 · Updated last year
- Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML. ☆363 · Updated last month
- Java library for rendering PDF documents to the screen using Java2D ☆187 · Updated last year
- documents4j is a Java library for converting documents into another document format ☆553 · Updated last month
- JAI ImageIO Core (without javax.media.jai dependencies) ☆234 · Updated 10 months ago
- Java utility for parsing PDF tabular data using Apache PDFBox and OpenCV ☆69 · Updated last year
- Convert Word documents to simple and clean HTML ☆248 · Updated 2 months ago
- Plain Java unrar library ☆286 · Updated 4 months ago
- XDocReport Samples ☆54 · Updated 7 years ago
- Box editor and trainer for Tesseract OCR ☆221 · Updated 2 months ago
- Java JNA Wrapper for Leptonica Image Processing Library ☆27 · Updated 2 months ago
- A Java library to convert .pdf files into .epub, .txt, .png, .jpg, .zip formats. ☆203 · Updated last year
- Aspose.PDF for Java examples, plugins and showcases ☆127 · Updated last year
- Converts XHTML to OpenXML WordML (docx) using docx4j ☆135 · Updated last month
- JPEG2000 support for Java Advanced Imaging Image I/O Tools API ☆73 · Updated 9 months ago
- pdfOCR is an iText 7 add-on to recognize and extract text in scanned documents and images. It can also convert them into fully ISO-compli… ☆30 · Updated this week
- Test area for public PDFBox v2 issues on stackoverflow etc ☆82 · Updated 3 weeks ago
- Java GUI frontend for Tesseract OCR engine ☆62 · Updated last month
- An Eclipse Plugin to integrate different Class Decompiler seamlessly into the development workflow ☆262 · Updated 2 months ago
- Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader) ☆176 · Updated last month
- Export docx to PDF via XSL FO, using FOP ☆46 · Updated 6 months ago
- Framework ☆293 · Updated 3 years ago
- Automatically exported from code.google.com/p/java-html2image ☆133 · Updated last year
- A standalone Java library/command line tool that converts DOC, DOCX, PPT, PPTX and ODT documents to PDF files. ☆590 · Updated last year