jfilter / pdf-scripts
📑 Scripts to repair, verify, OCR, compress, wrangle, crop (etc.) PDFs
☆69Updated last year
Alternatives and similar repositories for pdf-scripts
Users who are interested in pdf-scripts are comparing it to the libraries listed below.
- ReadablePDF streamlines the effort of turning a not-so-great PDF into a more easily readable PDF (or of course a pretty decent PDF into a…☆33Updated 3 years ago
- JavaScript/Node wrapper around Mozilla's Readability library so that ArchiveBox can call it as a one-shot CLI command to extract each page…☆40Updated 10 months ago
- Generate a list of your GitHub stars by topic - automatically!☆78Updated 2 years ago
- Create one timeline from various digital sources☆12Updated 8 months ago
- Back up and parse your browser history databases (Chrome, Firefox, Safari, and other Chrome/Firefox derivatives)☆141Updated 7 months ago
- Presentations on Quantified Self and Self-Tracking with Python☆30Updated 2 years ago
- A social media open post web archiving tool☆27Updated last month
- Some tools to help analyze the Twitter archive☆62Updated last month
- Jurisdiction ID and abbreviation data files for using with Jurism and other projects.☆37Updated last year
- Export your GitHub activity: events, repositories, stars, etc.☆52Updated last year
- A collection of curated home-built packages for the cross-platform text expander Espanso☆42Updated 3 weeks ago
- Search Google Scholar and return only the papers published in high h-index journals☆17Updated 2 years ago
- Comparing WARC files☆17Updated 6 years ago
- Export/access your Hypothes.is data: annotations and profile info☆44Updated this week
- A financial disclosure data extraction tool.☆16Updated last year
- GoodLinks Exporter☆10Updated last year
- Export user data from your Reddit accounts. Submissions, comments, and saved and upvoted content are supported.☆22Updated 8 months ago
- Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler.☆43Updated 2 weeks ago
- A simple, open source, self-hosted todo manager.☆19Updated last year
- ☆20Updated 2 years ago
- Download a webpage as an e-book☆7Updated 2 weeks ago
- Python-based Wikidata framework for easy dataframe extraction☆45Updated last year
- A wrapper for tesseract / abbyyOCR11 ocr4linux finereader cli that can perform batch operations or monitor a directory and launch an OCR …☆65Updated last year
- Extract networks of entities from journalistic reporting☆48Updated 2 years ago
- A helper library full of URL-related heuristics.☆70Updated last month
- Markdown text to a novel in ePub and PDF.☆56Updated 3 years ago
- Export your personal Spotify data: playlists, saved tracks/albums/shows, etc. as JSON☆38Updated last year
- A collection of regular expressions for matching citations to state, federal, and even international law☆36Updated 4 years ago
- Airtable backup script package☆23Updated 3 years ago
- A repository of datasets for learning and mastering Gephi☆10Updated 7 months ago