fritz-hh / OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
☆261 Updated 9 years ago
Alternatives and similar repositories for OCRmyPDF
Users who are interested in OCRmyPDF are comparing it to the libraries listed below
- A toolbox and web application for working with and presenting textual material from Shakespeare to Schopenhauer, and letters to literatur… ☆149 Updated 10 years ago
- ☆29 Updated 9 years ago
- Enhanced Social Tagging for Academic Communities ☆97 Updated last month
- mirror a website, put it in a bag ☆24 Updated 3 years ago
- Create a git repository from the revision history of a document in Google Drive. ☆134 Updated 8 years ago
- Extract tables from PDF files ☆359 Updated 9 years ago
- Modular workflow assistant for book digitization ☆131 Updated 9 years ago
- Docker container to provide Apache Tika RESTful API ☆41 Updated 9 years ago
- a NodeJS library for monitoring changes on Wikipedia sites ☆70 Updated 4 years ago
- Scripts to create git repositories for ALTO XML texts, like those from the British Library's scanned documents. ☆31 Updated 8 years ago
- Facilitating the global conversation on academic literature ☆267 Updated 8 years ago
- An online annotation platform for teaching and learning in the humanities. ☆108 Updated last month
- a CLI suggestion tool for Wikidata entities ☆30 Updated 9 years ago
- A website for crowd-sourcing structured election candidate data ☆59 Updated 5 years ago
- scribe API ☆81 Updated 6 years ago
- "Old SFM" -- manage rules and streams from social data sources, starting with twitter. ☆86 Updated 2 years ago
- [DEPRECATED] Please use https://goodtables.io ☆13 Updated 9 years ago
- Politwoops web front end ☆44 Updated 8 years ago
- A tool for the geospatial analysis, literary network visualization, and plot mapping of ancient texts ☆15 Updated 7 years ago
- display urls being tweeted with an event hashtag ☆18 Updated 9 years ago
- MOVED TO https://gitlab.com/crossref/pdfmark ☆34 Updated 7 years ago
- Open source large document set visualization platform ☆270 Updated 2 years ago
- Python scripts for interacting with the hypothes.is API ☆48 Updated 8 years ago
- Scripto is an open source documentary transcription tool library written in PHP. ☆34 Updated 8 years ago
- A fast, responsive HTML5 viewer for scanned items, developed for the World Digital Library. A project of the Library of Congress. Note: p… ☆22 Updated 10 years ago
- “Let Me Get That Data For You” catalogs the machine-readable data on a given domain name. [RETIRED] ☆102 Updated 10 years ago
- Turns legal citations in the DOM into links ☆20 Updated 8 years ago
- A minimal Akoma Ntoso -based legal informatics toolchain ☆15 Updated 2 years ago
- Check out https://github.com/webrecorder/webrecorder for newer version matching https://webrecorder.io ☆38 Updated 10 years ago
- Envisioning the future of the Hypothesis. ☆40 Updated 7 years ago