deajan / pmOCR
A wrapper for tesseract / abbyyOCR11 ocr4linux finereader cli that can perform batch operations or monitor a directory and launch an OCR conversion on file activity
☆65 · Updated last year
Alternatives and similar repositories for pmOCR
Users who are interested in pmOCR are comparing it to the libraries listed below.
- Tool that does layout analysis and/or text recognition using tesseract and outputs the result in Page XML format ☆46 · Updated last month
- A tiny frontend for OCRing PDF files via the web. ☆49 · Updated 5 years ago
- Short script for removing watermarks from PDF files. Requires pdftk. ☆58 · Updated 6 years ago
- A free tool to OCR a PDF and add a text "layer" in the original file, making a searchable PDF. Use only open source tools. Please tip! ☆287 · Updated last year
- Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader) ☆188 · Updated last week
- ReadablePDF streamlines the effort of turning a not so great PDF into a more easily readable PDF (or of course a pretty decent PDF into a… ☆33 · Updated 3 years ago
- Ergonomic line-by-line transcription of scanned text. ☆51 · Updated 4 years ago
- Tool for visualizing hOCR output from Tesseract (or other OCR engines that support hOCR). ☆23 · Updated 10 years ago
- Building scantailor and its dependencies ☆58 · Updated last year
- The hOCR Embedded OCR Workflow and Output Format ☆74 · Updated 9 months ago
- web interface for recoll desktop search ☆287 · Updated 4 years ago
- WIP tag-based file organizer & search ☆39 · Updated last year
- OCR for DjVu ☆48 · Updated 2 years ago
- Java program to add bookmarks to pdf (stable) ☆27 · Updated 4 years ago
- A free Windows graphical interface to the Tesseract 4.0 OCR engine. ☆58 · Updated 3 years ago
- LibGen ☆16 · Updated 12 years ago
- OCR evaluation brought to you by University of Alicante ☆67 · Updated 2 years ago
- Tesseract Powered Windows Desktop OCR Application With Multiple Pre/Post Processing GUI ☆42 · Updated last year
- A hosted version of the Word to Markdown gem ☆75 · Updated last week
- Export / upload emails from Thunderbird mbox files to single eml files ☆23 · Updated 2 years ago
- A python interface for genealogical tools (Geni, RootsMagic, GEDCOM, Family Search...) ☆35 · Updated 4 years ago
- Scripts and results from our OCR roundup, available on Source ☆150 · Updated 6 years ago
- Tool to OCR PDFs using Google Cloud Vision ☆42 · Updated 2 years ago
- PDF to XML ALTO file converter ☆238 · Updated this week
- PAGE XML format collection for document image page content and more ☆67 · Updated 3 years ago
- Extract meaningful content from pdf and psd file, such as texts and images both linked into a common JSON string ☆37 · Updated 7 years ago
- A post-processing tool for scanned sheets of paper. ☆81 · Updated last year
- compare two PDF files, write a resulting PDF with highlighted changes ☆56 · Updated 9 months ago
- OCRmyPDF EasyOCR plugin ☆84 · Updated last month
- Master repository which includes most other OCR-D repositories as submodules ☆73 · Updated last month