OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
☆260 · Jan 19, 2016 · Updated 10 years ago
Alternatives and similar repositories for OCRmyPDF
Users that are interested in OCRmyPDF are comparing it to the libraries listed below.
- Administrator interface and tools for managing CKAN Data Catalogs. ☆23 · Nov 5, 2015 · Updated 10 years ago
- Automatic alignment of books between HathiTrust, Internet Archive, Google Books, etc. ☆37 · May 8, 2026 · Updated last month
- A simple script to look for and process all the federal data.json data inventories. ☆46 · Mar 10, 2015 · Updated 11 years ago
- This is a list of various datasets that are collected by States initially and then provided to federal agencies. ☆20 · Dec 17, 2021 · Updated 4 years ago
- Training files produced for and by the Tesseract OCR engine for work on the Early Modern OCR Project (eMOP) ☆37 · Sep 24, 2015 · Updated 10 years ago
- code to remove "noise" from hOCR output of Tesseract OCR. ☆14 · Oct 24, 2016 · Updated 9 years ago
- Open Data Portal Requirements ☆14 · May 13, 2025 · Updated last year
- A semantic analysis tool to generate synonym.txt files for Solr. [RETIRED] ☆25 · Sep 14, 2016 · Updated 9 years ago
- Lib flatterer: A lib to make JSON flatterer ☆17 · May 16, 2025 · Updated last year
- Assume you have a large book collection in some folder/folders, and you would like to create a database of your books, so that you can kn… ☆10 · Jan 7, 2014 · Updated 12 years ago
- ☆17 · May 22, 2026 · Updated last month
- Data on 268 New York City traffic deaths in 2014. ☆10 · Feb 19, 2015 · Updated 11 years ago
- Tools for working with online critical apparatus in TEI ☆11 · Sep 5, 2023 · Updated 2 years ago
- code to analyze the legal citation network ☆25 · Sep 16, 2017 · Updated 8 years ago
- All of The OpenGov Foundation's legal docs in one externals-linked repo. ☆23 · Oct 30, 2015 · Updated 10 years ago
- Code for extracting data from a large number of PDFs, particularly FCC political ad documents ☆15 · Oct 26, 2017 · Updated 8 years ago
- Publishes the Service Manual on GOV.UK ☆12 · Updated this week
- Measure is scripts and conventions to build KPI dashboards for projects. ☆15 · Jul 14, 2020 · Updated 5 years ago
- An extensible system to keep track of boards & commissions details, the people appointed to those groups, any legislation they write, and… ☆17 · May 28, 2026 · Updated last month
- Alpha for notify API. Sends emails/sms/printed content on behalf of government. ☆15 · Feb 8, 2016 · Updated 10 years ago
- A simple Python Flask-based implementation of the IIIF Image API 2.0 standard ☆12 · Feb 4, 2022 · Updated 4 years ago
- A small repo of notes and scripts for collecting data on U.S. deadly force police incidents ☆10 · Aug 9, 2015 · Updated 10 years ago
- A contextual news development environment. ☆49 · Dec 19, 2014 · Updated 11 years ago
- An introduction to Python - https://www.digitalgov.gov/event/online-intro-to-python/ ☆10 · Aug 2, 2017 · Updated 8 years ago
- A repository for creating and maintaining a geospatial representation of U.S. electrical utilities' service territories ☆19 · Nov 21, 2014 · Updated 11 years ago
- Wikimedia Commons map georectifier and warper. See wikimaps_new branch. ☆11 · Jul 4, 2022 · Updated 3 years ago
- A Jekyll plugin to test frontmatter on posts and other documents in a Jekyll site. ☆29 · Mar 24, 2017 · Updated 9 years ago
- Website for The State of FOSS in India report. ☆11 · Aug 20, 2021 · Updated 4 years ago
- A no-frills open data portal built with node, express, and mongodb ☆86 · Apr 17, 2017 · Updated 9 years ago
- Django port of the Google App Engine rsstodolist application ☆16 · Jul 15, 2021 · Updated 4 years ago
- Tracking the tools I've found useful ☆14 · Feb 28, 2017 · Updated 9 years ago
- Extract networks of entities from journalistic reporting ☆49 · Jul 17, 2023 · Updated 2 years ago
- A command line application for validating CSV files ☆11 · Feb 16, 2016 · Updated 10 years ago
- ⚔️ M-x kill-all-the-thing ☠️ ☆10 · Oct 16, 2017 · Updated 8 years ago
- Download files from an Internet Archive collection or item ☆17 · Jun 12, 2014 · Updated 12 years ago
- Firefox add-on: Drink now, pay later: put your tabs on your bar tab! ☆11 · Jul 14, 2015 · Updated 10 years ago
- MOAI, an Open Access Server Platform for Institutional Repositories ☆15 · Apr 21, 2023 · Updated 3 years ago
- ☆12 · Jul 15, 2024 · Updated last year
- Build R Packages using Travis CI Containers ☆16 · Jul 7, 2017 · Updated 8 years ago