A free tool to OCR a PDF and add a text "layer" to the original file, making it a searchable PDF. Uses only open source tools. Please tip!
☆303 · May 24, 2026 · Updated last month
Alternatives and similar repositories for pdf2pdfocr
Users who are interested in pdf2pdfocr are comparing it to the libraries listed below.
- `pdf2searchablepdf input.pdf` = voila! "input_searchable.pdf" is created & now has searchable text! ☆137 · Aug 2, 2023 · Updated 2 years ago
- Convert a PDF via OCR to a TXT file in UTF-8 encoding ☆160 · Oct 3, 2023 · Updated 2 years ago
- A free Windows graphical interface to the Tesseract 4.0 OCR engine. ☆61 · Feb 16, 2022 · Updated 4 years ago
- ☆10 · Mar 16, 2023 · Updated 3 years ago
- Tool that does layout analysis and/or text recognition using tesseract and outputs the result in Page XML format ☆47 · Mar 31, 2025 · Updated last year
- Python script to do PDF OCR conversion using Tesseract ☆371 · Jun 2, 2023 · Updated 3 years ago
- xState-based validation tool for OCF files ☆15 · Jun 27, 2026 · Updated last week
- Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML. ☆415 · Aug 10, 2024 · Updated last year
- A wrapper for tesseract / abbyyOCR11 ocr4linux finereader cli that can perform batch operations or monitor a directory and launch an OCR … ☆67 · Jan 6, 2024 · Updated 2 years ago
- gcv2hocr converts from Google Cloud Vision OCR output to hocr to make a searchable pdf. ☆108 · Oct 22, 2020 · Updated 5 years ago
- A simple library for segmenting legal texts ☆18 · Apr 22, 2023 · Updated 3 years ago
- OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched ☆33,987 · Jun 27, 2026 · Updated last week
- A simple demo showing how to use the Ideogram inpainting model on Replicate using Node.js. ☆16 · Oct 24, 2024 · Updated last year
- Tool to OCR PDFs using Google Cloud Vision ☆42 · Dec 7, 2022 · Updated 3 years ago
- Dead Sea Scrolls in TF format based on Abegg's data ☆31 · Apr 22, 2026 · Updated 2 months ago
- Ever wanted to use custom discord emojis on other servers, without a nitro subscription? Well, with this script, YOU CAN without needing … ☆26 · Jan 8, 2021 · Updated 5 years ago
- Onchain cap table management with an offchain SEC transfer agent-compliant DB. ☆16 · Jun 27, 2026 · Updated last week
- postcorrection web ☆12 · Mar 6, 2023 · Updated 3 years ago
- Quartz Filters for MacOS, providing transformations to PDF files. ☆16 · May 3, 2022 · Updated 4 years ago
- Client library for OpenOCR ☆32 · Dec 3, 2014 · Updated 11 years ago
- Automatically exported from code.google.com/p/osis-converters ☆13 · Jun 6, 2026 · Updated 3 weeks ago
- Configuration files for Unbound as a caching DNS server with DNSSEC validation and DNS over TLS forwarding. ☆13 · Jan 13, 2019 · Updated 7 years ago
- Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader) ☆204 · May 21, 2025 · Updated last year
- A set of tools for rotating, cropping, and binding the images from a scanned book into a PDF. ☆20 · Aug 15, 2018 · Updated 7 years ago
- A post-processing tool for scanned sheets of paper. ☆1,190 · Jul 11, 2024 · Updated last year
- Visual, page-by-page comparison of two PDF files ☆21 · Apr 7, 2014 · Updated 12 years ago
- Extract tables from scanned image PDFs using Optical Character Recognition. ☆277 · Jun 9, 2020 · Updated 6 years ago
- NLP Web API for Legal Text ☆19 · Dec 23, 2022 · Updated 3 years ago
- Technical Committee Documents ☆17 · Updated this week
- Hyperbox Client ☆13 · Dec 27, 2021 · Updated 4 years ago
- OCR engine for all the languages ☆1,022 · Jun 26, 2026 · Updated last week
- framework and tools for statically-generated and dynamic online reading environments ☆14 · Jun 12, 2017 · Updated 9 years ago
- Tools to process books in a cloud based pipeline system ☆64 · May 28, 2026 · Updated last month
- A mirror of https://git.tecosaur.net/tec/pdftotext.el ☆12 · Jan 4, 2024 · Updated 2 years ago
- Code for several utilities for use with VIVO ☆11 · Nov 15, 2012 · Updated 13 years ago
- This is a project library for Google Apps Script (GAS). ☆12 · Jan 29, 2018 · Updated 8 years ago
- Write beautifully short contracts. https://reference.legal/ is a referenceable clause library to standardize contracts once and for all. ☆13 · Jul 12, 2022 · Updated 3 years ago
- JWT REST API using the RealWorld spec and Google Apps Script ☆14 · Jan 5, 2023 · Updated 3 years ago
- A web app for transliterating Hebrew ☆18 · Jun 24, 2026 · Updated last week