fritz-hh / OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
☆261 Updated 9 years ago
Alternatives and similar repositories for OCRmyPDF
Users who are interested in OCRmyPDF are comparing it to the libraries listed below
- A toolbox and web application for working with and presenting textual material from Shakespeare to Schopenhauer, and letters to literatur… ☆149 Updated 10 years ago
- neonion is a user-centered collaborative semantic annotation webapp developed at the Human-Centered Computing group at Freie Universität … ☆68 Updated 6 years ago
- Check out https://github.com/webrecorder/webrecorder for newer version matching https://webrecorder.io ☆38 Updated 9 years ago
- Create a git repository from the revision history of a document in Google Drive. ☆134 Updated 7 years ago
- a NodeJS library for monitoring changes on Wikipedia sites ☆70 Updated 3 years ago
- DEPRECATED. This repository is no longer maintained. Please fork and work away. ☆122 Updated 10 years ago
- Original 2016 take at what is now Linked Paths, the demonstrator for GeoJSON-T developed under a Pelagios micro-grant ☆89 Updated 8 years ago
- Modular workflow assistant for book digitization ☆125 Updated 9 years ago
- Turns legal citations in the DOM into links ☆20 Updated 8 years ago
- ☆31 Updated 2 years ago
- A visual implementation of individual U.S. taxes ☆39 Updated 3 weeks ago
- Breve ☆28 Updated 5 years ago
- Analyzing the April 2016 Data about the Usage of Sci-Hub ☆28 Updated 8 years ago
- A simple OpenRefine reconciliation service that runs on top of a CSV file ☆120 Updated 9 years ago
- command line resource for working with digital primary sources ☆27 Updated 6 years ago
- An online annotation platform for teaching and learning in the humanities. ☆107 Updated 3 months ago
- Scripts to create git repositories for ALTO XML texts, like those from the British Library's scanned documents. ☆31 Updated 7 years ago
- “Let Me Get That Data For You” catalogs the machine-readable data on a given domain name. [RETIRED] ☆102 Updated 10 years ago
- Pathways Project ☆14 Updated 9 years ago
- Automatic alignment of books between HathiTrust, Internet Archive, Google Books, etc. ☆35 Updated 3 weeks ago
- Extract tables from PDF files ☆356 Updated 8 years ago
- Parser and standardizer for politician, individual and organization names. ☆129 Updated 7 years ago
- MOVED TO https://gitlab.com/crossref/pdfmark ☆33 Updated 6 years ago
- a CLI suggestion tool for Wikidata entities ☆30 Updated 8 years ago
- Guides and introductions for participating in Labs and some of its projects. ☆170 Updated 8 years ago
- Scan a folder of document files of all types and extract the text into a CSV suitable for Overview ☆26 Updated 9 years ago
- A d3 layout for creating XKCD style narrative charts ☆190 Updated 9 years ago
- Drop in crowdsourcing for your Rails app. Extracted from Free the Files. ☆83 Updated 10 years ago
- Politwoops web front end ☆44 Updated 7 years ago
- The Legal Resource Registry has moved! ☆16 Updated 5 years ago