A free tool to OCR a PDF and add a text "layer" to the original file, making a searchable PDF. Uses only open source tools. Please tip!
☆303 · May 25, 2025 · Updated 9 months ago
Alternatives and similar repositories for pdf2pdfocr
Users who are interested in pdf2pdfocr are comparing it to the libraries listed below.
- `pdf2searchablepdf input.pdf` = voila! "input_searchable.pdf" is created & now has searchable text! ☆137 · Aug 2, 2023 · Updated 2 years ago
- A free Windows graphical interface to the Tesseract 4.0 OCR engine. ☆61 · Feb 16, 2022 · Updated 4 years ago
- ☆10 · Mar 16, 2023 · Updated 3 years ago
- Tool that does layout analysis and/or text recognition using tesseract and outputs the result in Page XML format ☆46 · Mar 31, 2025 · Updated 11 months ago
- xState-based validation tool for OCF files ☆15 · Apr 10, 2025 · Updated 11 months ago
- A wrapper for tesseract / abbyyOCR11 ocr4linux finereader cli that can perform batch operations or monitor a directory and launch an OCR … ☆67 · Jan 6, 2024 · Updated 2 years ago
- gcv2hocr converts from Google Cloud Vision OCR output to hocr to make a searchable pdf. ☆106 · Oct 22, 2020 · Updated 5 years ago
- Dead Sea Scrolls in TF format based on Abegg's data ☆25 · Jan 29, 2026 · Updated last month
- OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched ☆32,941 · Mar 12, 2026 · Updated last week
- A simple library for segmenting legal texts ☆18 · Apr 22, 2023 · Updated 2 years ago
- Tool to OCR PDFs using Google Cloud Vision ☆42 · Dec 7, 2022 · Updated 3 years ago
- Onchain cap table management with an offchain SEC transfer agent-compliant DB. ☆15 · Mar 1, 2026 · Updated 2 weeks ago
- postcorrection web ☆12 · Mar 6, 2023 · Updated 3 years ago
- Client library for OpenOCR ☆31 · Dec 3, 2014 · Updated 11 years ago
- Automatically exported from code.google.com/p/osis-converters ☆13 · Updated this week
- Configuration files for Unbound as a caching DNS server with DNSSEC validation and DNS over TLS forwarding. ☆13 · Jan 13, 2019 · Updated 7 years ago
- Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader) ☆201 · May 21, 2025 · Updated 9 months ago
- A set of tools for rotating, cropping, and binding the images from a scanned book into a PDF. ☆19 · Aug 15, 2018 · Updated 7 years ago
- A post-processing tool for scanned sheets of paper. ☆1,159 · Jul 11, 2024 · Updated last year
- Visual, page-by-page comparison of two PDF files ☆21 · Apr 7, 2014 · Updated 11 years ago
- Extract tables from scanned image PDFs using Optical Character Recognition. ☆277 · Jun 9, 2020 · Updated 5 years ago
- Statistical/Machine Learning using Randomized and Quasi-Randomized (neural) networks (currently Python & R) ☆20 · Updated this week
- NLP Web API for Legal Text ☆18 · Dec 23, 2022 · Updated 3 years ago
- OCR engine for all the languages ☆964 · Mar 10, 2026 · Updated last week
- Components library for Revolt. ☆11 · Jun 11, 2023 · Updated 2 years ago
- Fritzing Part: WeMos D1 Mini ☆13 · Feb 10, 2016 · Updated 10 years ago
- Run tesseract with the tesserocr bindings with @OCR-D's interfaces ☆39 · Apr 30, 2025 · Updated 10 months ago
- Little api client for paperless(-ngx): pypaperless ☆88 · Updated this week
- Snipline CLI is the command-line tool for Snipline ☆24 · Mar 5, 2021 · Updated 5 years ago
- A web app for transliterating Hebrew ☆17 · Feb 28, 2026 · Updated 2 weeks ago
- This is a project library for Google Apps Script (GAS). ☆12 · Jan 29, 2018 · Updated 8 years ago
- Internet Chess ToolKit is a java based set of libraries and widgets useful for performing common tasks such as reading PGN, FEN, and gene… ☆12 · Feb 22, 2017 · Updated 9 years ago
- jwt rest api using realworld spec and google apps script ☆14 · Jan 5, 2023 · Updated 3 years ago
- A post-processing tool for scanned sheets of paper. ☆85 · Mar 9, 2024 · Updated 2 years ago
- Web-based instruction in Biblical Hebrew and Greek ☆28 · Mar 9, 2026 · Updated last week
- API client for fetching and comparing passages from legislation ☆14 · Jan 26, 2025 · Updated last year
- Trained BERT and Word2Vec legal clause classifiers for SPACY using the Atticus Project's Open Source Contract Label Corpus ☆13 · Jan 2, 2021 · Updated 5 years ago
- (superseded by monorepo) CLI for working with the Revolt stack. ☆11 · Apr 29, 2022 · Updated 3 years ago
- ☆255 · Mar 12, 2026 · Updated last week