fritz-hh / OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
☆260 Updated 9 years ago
Alternatives and similar repositories for OCRmyPDF:
Users who are interested in OCRmyPDF are comparing it to the repositories listed below
- A toolbox and web application for working with and presenting textual material from Shakespeare to Schopenhauer, and letters to literatur… ☆149 Updated 10 years ago
- Modular workflow assistant for book digitization ☆125 Updated 9 years ago
- ☆31 Updated 2 years ago
- Docker container to provide Apache Tika RESTful API ☆41 Updated 9 years ago
- Check out https://github.com/webrecorder/webrecorder for newer version matching https://webrecorder.io ☆38 Updated 9 years ago
- Guides and introductions for participating in Labs and some of its projects. ☆170 Updated 8 years ago
- a NodeJS library for monitoring changes on Wikipedia sites ☆70 Updated 3 years ago
- ☆29 Updated 8 years ago
- Re-usable wrapper scripts for text document extractors. ☆37 Updated 8 years ago
- Structured Data from PDF image-based files ☆88 Updated 12 years ago
- ☆17 Updated 10 years ago
- A place to collect and share knowledge about liberating data from PDFs ☆54 Updated 3 years ago
- "Old SFM" -- manage rules and streams from social data sources, starting with twitter. ☆86 Updated last year
- Enhanced Social Tagging for Academic Communities ☆95 Updated 6 months ago
- Friendly Slack bot for looking up cases ☆21 Updated 7 years ago
- Scripts to create git repositories for ALTO XML texts, like those from the British Library's scanned documents. ☆31 Updated 7 years ago
- Adds text to PDF files using the cuneiform OCR software ☆326 Updated 4 years ago
- Turns legal citations in the DOM into links ☆20 Updated 8 years ago
- ☆36 Updated 7 years ago
- A visualization of the Berlin public transport map with and without accessible stations. ☆17 Updated 10 years ago
- Breve ☆28 Updated 5 years ago
- official diybookscanner repository ☆39 Updated 10 years ago
- A platform for tools that do stuff with data ☆56 Updated 6 years ago
- Pathagar is a simple bookserver serving OPDS feeds ☆101 Updated 2 weeks ago
- neonion is a user-centered collaborative semantic annotation webapp developed at the Human-Centered Computing group at Freie Universität … ☆68 Updated 6 years ago
- Encryption for Journalists - Hacks/Hackers NYC ☆40 Updated 11 years ago
- Facilitating the global conversation on academic literature ☆266 Updated 7 years ago
- command line resource for working with digital primary sources ☆27 Updated 6 years ago
- Lens - open science content creation and display ☆124 Updated 8 years ago
- Moved to: ☆58 Updated 5 years ago