jfilter / pdf-scripts
📑 Scripts to repair, verify, OCR, compress, wrangle, crop (etc.) PDFs
☆66Updated 9 months ago
Alternatives and similar repositories for pdf-scripts:
Users that are interested in pdf-scripts are comparing it to the libraries listed below
- List of free/libre open-source software that is already used, or even self-operated, in the public sector. Additions from …☆31Updated last year
- 📜 Dehyphenation of broken text (mainly German), i.e., extracted from a PDF☆38Updated 2 years ago
- ETL pipeline, graphical explorer, and general toolbox for investigations with follow the money data☆15Updated last year
- Comparing WARC files☆16Updated 5 years ago
- Convert ALTO XML to plain text + minimal metadata☆15Updated 4 months ago
- Easily display Zotero items on a webpage☆32Updated last year
- Awesome list dedicated to digital and data preservation tools, sources, services and so on.☆24Updated 2 years ago
- 📚 A compilation of research relevant to Data Together's efforts tackling the general problem of data resilience & interactivity☆96Updated 6 years ago
- ReadablePDF streamlines the effort of turning a not so great PDF into a more easily readable PDF (or of course a pretty decent PDF into a…☆33Updated 3 years ago
- A social media open post web archiving tool☆25Updated 2 months ago
- Typademic turns Markdown files written distraction-free into beautiful PDFs.☆22Updated 2 years ago
- Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler.☆39Updated last month
- Abbreviations for use with the Abbreviation Filter developed for use with Multilingual Zotero.☆17Updated last year
- Export your GitHub activity: events, repositories, stars, etc.☆51Updated last year
- A Python scraping module that extracts text from articles found in RSS feeds. Uses SQLite as its database.☆18Updated 7 months ago
- A helper library full of URL-related heuristics.☆64Updated 4 months ago
- Cross-platform library client to automate any OPAC and library catalog from your local device, e.g. for renewing borrowed books or se…☆38Updated last month
- A browser extension providing Open Access bibliographical services☆14Updated 2 years ago
- 😎 A community-curated list of awesome lawtech software and learning resources for legal technology and design.☆24Updated 5 years ago
- Create Robust Links from within Zotero☆17Updated 2 years ago
- Deep Zoom Image Downloader☆19Updated this week
- A list of things related to software, literature, and other content for 🕣 Memento☆94Updated 8 months ago
- A simple Python wrapper and command-line interface for archive.org’s "Save Page Now" capturing service☆174Updated 4 months ago
- A curated list of helpful information on open data☆19Updated 2 years ago
- A simple, open source, self-hosted todo manager.☆17Updated last year
- tesseractXplore: an easy-to-use Tesseract GUI with full control☆22Updated 3 years ago
- Extract networks of entities from journalistic reporting☆48Updated last year
- Presentations on Quantified Self and Self-Tracking with Python☆29Updated 2 years ago
- Scraper for German democracy documents☆34Updated last year
- Generate clean, readable PDFs from web articles☆30Updated last year