A free tool to OCR a PDF and add a text "layer" to the original file, making it a searchable PDF. Uses only open source tools. Please tip!
☆303 · May 24, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for pdf2pdfocr
Users who are interested in pdf2pdfocr are comparing it to the libraries listed below.
- `pdf2searchablepdf input.pdf` = voila! "input_searchable.pdf" is created & now has searchable text! ☆137 · Aug 2, 2023 · Updated 2 years ago
- A free Windows graphical interface to the Tesseract 4.0 OCR engine. ☆61 · Feb 16, 2022 · Updated 4 years ago
- ☆10 · Mar 16, 2023 · Updated 3 years ago
- Tool that does layout analysis and/or text recognition using tesseract and outputs the result in Page XML format ☆47 · Mar 31, 2025 · Updated last year
- Python script to do PDF OCR conversion using Tesseract ☆371 · Jun 2, 2023 · Updated 3 years ago
- xState-based validation tool for OCF files ☆15 · Updated this week
- Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML. ☆413 · Aug 10, 2024 · Updated last year
- A wrapper for tesseract / abbyyOCR11 ocr4linux finereader cli that can perform batch operations or monitor a directory and launch an OCR … ☆67 · Jan 6, 2024 · Updated 2 years ago
- OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched ☆33,815 · Jun 7, 2026 · Updated last week
- A simple demo showing how to use the Ideogram inpainting model on Replicate using Node.js. ☆16 · Oct 24, 2024 · Updated last year
- Tool to OCR PDFs using Google Cloud Vision ☆42 · Dec 7, 2022 · Updated 3 years ago
- Dead Sea Scrolls in TF format based on Abegg's data ☆31 · Apr 22, 2026 · Updated last month
- Onchain cap table management with an offchain SEC transfer agent-compliant DB. ☆16 · May 25, 2026 · Updated 2 weeks ago
- postcorrection web ☆12 · Mar 6, 2023 · Updated 3 years ago
- Client library for OpenOCR ☆32 · Dec 3, 2014 · Updated 11 years ago
- Configuration files for Unbound as a caching DNS server with DNSSEC validation and DNS over TLS forwarding. ☆13 · Jan 13, 2019 · Updated 7 years ago
- Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader) ☆202 · May 21, 2025 · Updated last year
- A set of tools for rotating, cropping, and binding the images from a scanned book into a PDF. ☆20 · Aug 15, 2018 · Updated 7 years ago
- A post-processing tool for scanned sheets of paper. ☆1,185 · Jul 11, 2024 · Updated last year
- Visual, page-by-page comparison of two PDF files ☆21 · Apr 7, 2014 · Updated 12 years ago
- OCR engine for all the languages ☆1,008 · Jun 5, 2026 · Updated last week
- framework and tools for statically-generated and dynamic online reading environments ☆14 · Jun 12, 2017 · Updated 9 years ago
- Tools to process books in a cloud based pipeline system ☆64 · May 28, 2026 · Updated 2 weeks ago
- Run tesseract with the tesserocr bindings with @OCR-D's interfaces ☆39 · Jun 5, 2026 · Updated last week
- This is a project library for Google Apps Script (GAS). ☆12 · Jan 29, 2018 · Updated 8 years ago
- guides and test data for OCR4all ☆32 · Oct 4, 2022 · Updated 3 years ago
- Write beautifully short contracts. https://reference.legal/ is a referenceable clause library to standardize contracts once and for all. ☆13 · Jul 12, 2022 · Updated 3 years ago
- jwt rest api using realworld spec and google apps script ☆14 · Jan 5, 2023 · Updated 3 years ago
- A web app for transliterating Hebrew ☆18 · May 28, 2026 · Updated 2 weeks ago
- MRE: A web framework written in Rust ☆63 · Oct 14, 2012 · Updated 13 years ago
- ☆16 · Feb 16, 2023 · Updated 3 years ago
- A low-code microservices platform designed for legal engineers. Given a document, Gremlin will apply a series of Python scripts to it and… ☆33 · May 25, 2022 · Updated 4 years ago
- API client for fetching and comparing passages from legislation ☆14 · Updated this week
- Trained BERT and Word2Vec legal clause classifiers for SPACY using the Atticus Project's Open Source Contract Label Corpus ☆13 · Jan 2, 2021 · Updated 5 years ago
- Project Documentation ☆12 · Jan 21, 2016 · Updated 10 years ago
- (superseded by monorepo) CLI for working with the Revolt stack. ☆11 · Apr 29, 2022 · Updated 4 years ago
- ☆13 · Dec 8, 2022 · Updated 3 years ago
- ☆17 · Jun 24, 2021 · Updated 4 years ago
- Log database for orbit-db ☆16 · Jan 18, 2023 · Updated 3 years ago