jfilter / pdf-scripts
📑 Scripts to repair, verify, OCR, compress, wrangle, crop (etc.) PDFs
☆63 Updated 6 months ago
Related projects
Alternatives and complementary repositories for pdf-scripts
- List of free/libre open-source software that is already used, or even self-hosted, in the public sector. Additions from … ☆31 Updated last year
- 📜 Dehyphenation of broken text (mainly German), i.e., extracted from a PDF ☆38 Updated 2 years ago
- A Python scraping module that extracts text from articles found in RSS feeds. Uses SQLite as its database. ☆19 Updated 4 months ago
- Datasette plugin for uploading CSV files and converting them to database tables ☆24 Updated 7 months ago
- A social media open post web archiving tool ☆25 Updated last month
- A curated list of helpful information on open data ☆19 Updated 2 years ago
- ETL pipeline, graphical explorer, and general toolbox for investigations with Follow the Money data ☆14 Updated 10 months ago
- Chrome extension that uses Memento to indicate that a page a user is viewing on the live web has an archived copy and to give the user ac… ☆49 Updated last month
- Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler. ☆33 Updated last week
- A helper library full of URL-related heuristics. ☆64 Updated last month
- New Collection Tool ☆41 Updated 4 years ago
- Extract networks of entities from journalistic reporting ☆47 Updated last year
- Scraper for German democracy documents ☆32 Updated last year
- 📚 Online archive for annual reports of the German domestic intelligence services ☆11 Updated last week
- Scraper for Facebook, Gab, Google and TikTok ☆22 Updated 4 months ago
- Adds a reconciliation API endpoint to Datasette, based on the Reconciliation Service API specification. ☆23 Updated 9 months ago
- OffeneRegister.de – open data for the German commercial register (Handelsregister) ☆30 Updated 2 years ago
- Fatal police shootings since 1976 ☆12 Updated 2 weeks ago
- PubPeer Chrome browser extension ☆9 Updated 6 months ago
- Collection of dysfunctional patterns in the open data space ☆42 Updated last year
- Tool to index and serve HTML files. Powered by Datasette. ☆86 Updated 2 years ago
- Export your GitHub activity: events, repositories, stars, etc. ☆48 Updated last year
- The root of the webcurator tool project, containing all modules needed to run a fully functional webcurator tool. ☆5 Updated this week
- Self tracking your browser history! ☆20 Updated 10 months ago
- 🍨 High-fidelity, browser-based, single-page web archiving library and CLI for witnessing the web. ☆117 Updated this week
- Homebrew formula for the ArchiveBox self-hosted internet archiving solution. ☆26 Updated last month
- Public assemblies in Berlin: preserving historical data. ☆16 Updated this week
- ⚙️ The backend for OffeneGesetze.de ☆25 Updated 10 months ago
- Transparency report (Transparenzreport) ☆11 Updated 2 years ago