fritz-hh/OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched

☆260

Alternatives and similar repositories for OCRmyPDF

Users who are interested in OCRmyPDF are comparing it to the repositories listed below.

datacats / ckan-multisite
Administrator interface and tools for managing CKAN Data Catalogs.
☆23 · Nov 5, 2015 · Updated 10 years ago
ryanfb / book-aligner
Automatic alignment of books between HathiTrust, Internet Archive, Google Books, etc.
☆37 · May 8, 2026 · Updated last month
datanews / data-inventories
A simple script to look for and process all the federal data.json data inventories.
☆46 · Mar 10, 2015 · Updated 11 years ago
tkleykamp / state-federal-datasets
This is a list of various datasets that are collected by States initially and then provided to federal agencies.
☆20 · Dec 17, 2021 · Updated 4 years ago
Early-Modern-OCR / TesseractTraining
Training files produced for and by the Tesseract OCR engine for work on the Early Modern OCR Project (eMOP)
☆37 · Sep 24, 2015 · Updated 10 years ago
Early-Modern-OCR / hOCR-De-Noising
code to remove "noise" from hOCR output of Tesseract OCR.
☆14 · Oct 24, 2016 · Updated 9 years ago
govex / open-data-portal-requirements
Open Data Portal Requirements
☆14 · May 13, 2025 · Updated last year
opendata / Legal-Synonyms
A semantic analysis tool to generate synonym.txt files for Solr. [RETIRED]
☆25 · Sep 14, 2016 · Updated 9 years ago
kindly / libflatterer
Lib flatterer: A lib to make JSON flatterer
☆17 · May 16, 2025 · Updated last year
alexbeliaev / The-Local-Book-Library
Assume you have a large book collection in some folder/folders, and you would like to create a database of your books, so that you can kn…
☆10 · Jan 7, 2014 · Updated 12 years ago
robbert-harms / pyschematron
☆17 · May 22, 2026 · Updated last month
datanews / mean-streets
Data on 268 New York City traffic deaths in 2014.
☆10 · Feb 19, 2015 · Updated 11 years ago
hcayless / appcrit
Tools for working with online critical apparatus in TEI
☆11 · Sep 5, 2023 · Updated 2 years ago
idc9 / law-net
code to analyze the legal citation network
☆25 · Sep 16, 2017 · Updated 8 years ago
opengovfoundation / legal-docs
All of The OpenGov Foundation's legal docs in one externals-linked repo.
☆23 · Oct 30, 2015 · Updated 10 years ago
alexbyrnes / FCC-Political-Ads_The-Code
Code for extracting data from a large number of PDFs, particularly FCC political ad documents
☆15 · Oct 26, 2017 · Updated 8 years ago
alphagov / service-manual-publisher
Publishes the Service Manual on GOV.UK
☆12 · Updated this week
okfn / measure
Measure is scripts and conventions to build KPI dashboards for projects.
☆15 · Jul 14, 2020 · Updated 5 years ago
City-of-Bloomington / OnBoard
An extensible system to keep track of boards & commissions details, the people appointed to those groups, any legislation they write, and…
☆17 · May 28, 2026 · Updated last month
alphagov / notify-api
Alpha for notify API. Sends emails/sms/printed content on behalf of government.
☆15 · Feb 8, 2016 · Updated 10 years ago
rogerhoward / iiify
A simple Python Flask-based implementation of the IIIF Image API 2.0 standard
☆12 · Feb 4, 2022 · Updated 4 years ago
deadlyforcedb / data-recipes
A small repo of notes and scripts for collecting data on U.S. deadly force police incidents
☆10 · Aug 9, 2015 · Updated 10 years ago
pudo-attic / storyweb
A contextual news development environment.
☆49 · Dec 19, 2014 · Updated 11 years ago
18F / an_introduction_to_python
An introduction to Python - https://www.digitalgov.gov/event/online-intro-to-python/
☆10 · Aug 2, 2017 · Updated 8 years ago
faradayio / utility-landscape
A repository for creating and maintaining a geospatial representation of U.S. electrical utilities' service territories
☆19 · Nov 21, 2014 · Updated 11 years ago
wikimaps-dev / mapwarper
Wikimedia Commons map georectifier and warper. See wikimaps_new branch.
☆11 · Jul 4, 2022 · Updated 3 years ago
18F / jekyll_frontmatter_tests
A Jekyll plugin to test frontmatter on posts and other documents in a Jekyll site.
☆29 · Mar 24, 2017 · Updated 9 years ago
state-of-foss / state-of-foss.github.io
Website for The State of FOSS in India report.
☆11 · Aug 20, 2021 · Updated 4 years ago
chriswhong / ReallySimpleOpenData
A no-frills open data portal built with node, express, and mongodb
☆86 · Apr 17, 2017 · Updated 9 years ago
paulgreg / rsstodolist-django-server
Django port of the Google App Engine rsstodolist application
☆16 · Jul 15, 2021 · Updated 4 years ago
laurenancona / data-resources
Tracking the tools I've found useful
☆14 · Feb 28, 2017 · Updated 9 years ago
opensanctions / storyweb
Extract networks of entities from journalistic reporting
☆49 · Jul 17, 2023 · Updated 2 years ago
dhcole / csv-test
A command line application for validating CSV files
☆11 · Feb 16, 2016 · Updated 10 years ago
AdrieanKhisbe / omni-kill.el
⚔️ M-x kill-all-the-thing ☠️
☆10 · Oct 16, 2017 · Updated 8 years ago
vmbrasseur / iadownload
Download files from an Internet Archive collection or item
☆17 · Jun 12, 2014 · Updated 12 years ago
maezred / BarTab-Plus
Firefox add-on: Drink now, pay later: put your tabs on your bar tab!
☆11 · Jul 14, 2015 · Updated 10 years ago
infrae / moai
MOAI, an Open Access Server Platform for Institutional Repositories
☆15 · Apr 21, 2023 · Updated 3 years ago
PREreview / prereview
☆12 · Jul 15, 2024 · Updated last year
jtilly / R-travis-container-example
Build R Packages using Travis CI Containers
☆16 · Jul 7, 2017 · Updated 8 years ago