eloops / hocr2pdf
Take a scanned image and the hOCR output from Tesseract, and create a PDF. That's it.
☆27Updated 2 years ago
Alternatives and similar repositories for hocr2pdf
Users that are interested in hocr2pdf are comparing it to the libraries listed below
- A free tool to OCR a PDF and add a text "layer" in the original file, making a searchable PDF. Uses only open source tools. Please tip!☆299Updated 7 months ago
- `pdf2searchablepdf input.pdf` = voila! "input_searchable.pdf" is created & now has searchable text!☆135Updated 2 years ago
- A wrapper for tesseract / abbyyOCR11 ocr4linux finereader cli that can perform batch operations or monitor a directory and launch an OCR …☆67Updated last year
- A post-processing tool for scanned sheets of paper.☆1,137Updated last year
- web interface for recoll desktop search☆292Updated 5 years ago
- PortableSigner - A Commandline and GUI Tool to digitally sign PDF files with X.509 certificates☆122Updated 6 years ago
- Live SQLite3 database master-slave replication with sqlite3-rdiff using rsync over SSH☆40Updated 9 years ago
- PDF minifier that removes duplicate data, re-compresses images, creates PDF/A-1b files, and digitally signs PDFs☆55Updated last year
- Bash Script to Scale and Resize PDFs using Ghostscript☆272Updated last year
- Extract structured data from PDF invoices☆14Updated 4 years ago
- Audio & Video chat for Etherpad - Video Conferencing with a focus on collaboration☆75Updated 2 months ago
- Export / upload emails from Thunderbird mbox files to single eml files☆23Updated 2 years ago
- fasttext with wheels and no external dependency, but only the predict method (<1MB)☆18Updated last year
- Keyper-docker is the Docker image build code for the Keyper SSH Key Based Authentication Manager☆32Updated 4 years ago
- An extendible and configurable PDF manipulation layer library written in java.☆533Updated 2 months ago
- A post-processing tool for scanned sheets of paper.☆85Updated last year
- Core server of the SEPIA Framework responsible for NLU, conversation, smart-service integration, user-accounts and more.☆100Updated 2 years ago
- Add annotations to a pdf on a web browser☆10Updated 9 years ago
- Python lib for Factur-X, the e-invoicing standard for France and Germany☆40Updated 7 years ago
- Index and search PDF files using Apache Lucene and PDF Box☆43Updated 2 months ago
- SQLite3 extension for read/write storage compression with Zstandard☆193Updated last year
- Code repository for PDFStitcher, a utility to stitch together and modify line properties of PDF sewing patterns.☆170Updated 3 weeks ago
- Scraper for downloading the entire ebooks repository of project Gutenberg☆155Updated this week
- Simple websocket broadcaster implemented in Rust☆60Updated 2 years ago
- A simple graphical tool to crop the pages of PDF files, written in Python/Qt☆136Updated 5 months ago
- Light and fast web spreadsheet☆86Updated 5 years ago
- OCR for DjVu☆47Updated 3 years ago
- Convert a PDF via OCR to a TXT file in UTF-8 encoding☆154Updated 2 years ago
- A personal finance tool in PHP☆18Updated 2 months ago
- An automated time tracker (WIP)☆132Updated last year