gkovacs / pdfocr
Adds text to PDF files using the cuneiform OCR software
☆326 · Updated 4 years ago
Alternatives and similar repositories for pdfocr:
Users who are interested in pdfocr are comparing it to the libraries listed below:
- OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched ☆260 · Updated 9 years ago
- Python script to do PDF OCR conversion using Tesseract ☆374 · Updated last year
- A small utility making use of the pypdf library to provide a (somewhat) lighter alternative to pdftk ☆288 · Updated last year
- Extract tables from PDF files ☆356 · Updated 8 years ago
- rsync backup with a bit of magic, detecting moved and renamed files ☆63 · Updated 8 months ago
- Pi Scan is a simple, robust capture appliance for book scanners. It runs on a Raspberry Pi 2. ☆274 · Updated 7 years ago
- Modular workflow assistant for book digitization ☆125 · Updated 8 years ago
- A post-processing tool for scanned sheets of paper. ☆1,067 · Updated 9 months ago
- The hOCR Embedded OCR Workflow and Output Format ☆74 · Updated 8 months ago
- Semantic filesystem for Linux, with relation reasoner, autotagging plugins and a deduplication service ☆316 · Updated 6 years ago
- ZBackup, a versatile deduplicating backup tool ☆840 · Updated 2 years ago
- Deduplicating backup program ☆1,104 · Updated 3 years ago
- Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML. ☆389 · Updated 8 months ago
- smoothscan is a tool to convert scanned text into a vectorized output form. ☆67 · Updated 11 years ago
- For those about to RIP - a Unix CD ripper preferring accuracy over speed ☆303 · Updated 4 years ago
- Create a git repository from the revision history of a document in Google Drive. ☆134 · Updated 7 years ago
- a modern, minimalist javascript photo gallery ☆251 · Updated 6 years ago
- A wrapper for tesseract / abbyyOCR11 ocr4linux finereader cli that can perform batch operations or monitor a directory and launch an OCR … ☆65 · Updated last year
- Bash Script to Scale and Resize PDFs using Ghostscript ☆262 · Updated 8 months ago
- Read-only mirror of https://gitlab.gnome.org/GNOME/ocrfeeder ☆86 · Updated last month
- Store and restore metadata from a filesystem. ☆173 · Updated last year
- Apple's Time Machine fuse read only file system ☆256 · Updated last year
- Parallel Processing Shell Script ☆108 · Updated 5 years ago
- Industry supported, open source PDF/A validation library ☆289 · Updated 2 weeks ago
- web interface for recoll desktop search ☆285 · Updated 4 years ago
- a python script for downloading ebooks from springerlink.com ☆116 · Updated 12 years ago
- ScanTailor Universal - a fork based on Enhanced+Featured+Master versions of ST ☆210 · Updated 3 weeks ago
- Web based JavaScript GUI library for proofreading/editing hOCR ☆95 · Updated 6 years ago
- python app/framework for 'all things ISBN' including metadata, descriptions, covers... ☆225 · Updated last year
- Free and simple TrueCrypt/VeraCrypt Implementation based on dm-crypt ☆553 · Updated 11 months ago