jfilter / pdf-scripts
Scripts to repair, verify, OCR, compress, wrangle, crop (etc.) PDFs
☆70 · Updated last year
Alternatives and similar repositories for pdf-scripts
Users who are interested in pdf-scripts are comparing it to the libraries listed below.
- Tool to index and serve HTML files. Powered by Datasette. ☆107 · Updated 3 years ago
- ReadablePDF streamlines the effort of turning a not so great PDF into a more easily readable PDF (or of course a pretty decent PDF into a… ☆33 · Updated 4 years ago
- Fast PDF generation and compression. Deals with millions of pages daily. ☆125 · Updated last month
- A post-processing tool for scanned sheets of paper. ☆84 · Updated last year
- A social media open post web archiving tool ☆27 · Updated last week
- Export your personal Spotify data: playlists, saved tracks/albums/shows, etc. as JSON ☆38 · Updated 2 months ago
- JavaScript/Node wrapper around Mozilla's Readability library so that ArchiveBox can call it as a oneshot CLI command to extract each page… ☆40 · Updated last year
- High-fidelity, browser-based, single-page web archiving library and CLI for witnessing the web. ☆176 · Updated last month
- Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler. ☆50 · Updated 3 weeks ago
- Comparing WARC files ☆17 · Updated 6 years ago
- A list of things related to software, literature, and other content for Memento ☆99 · Updated last year
- Dehyphenation of broken text (mainly German), i.e., extracted from a PDF ☆39 · Updated 3 years ago
- Extract networks of entities from journalistic reporting ☆48 · Updated 2 years ago
- Back up and parse your browser history databases (Chrome, Firefox, Safari, and other Chrome/Firefox derivatives) ☆146 · Updated 3 weeks ago
- Create Robust Links from within Zotero ☆20 · Updated 3 years ago
- A wrapper for tesseract / abbyyOCR11 ocr4linux finereader cli that can perform batch operations or monitor a directory and launch an OCR … ☆66 · Updated last year
- Export/access your Hypothes.is data: annotations and profile info ☆45 · Updated 2 months ago
- Make graphs you can play with... Web app in Flask and Bootstrap to fetch Zotero datasets and then create graph visualizations with d3.js ☆22 · Updated 7 years ago
- ☆43 · Updated 2 years ago
- Search Google Scholar and return only the papers published in high h-index journals ☆17 · Updated 2 years ago
- Python Package to reconstruct the original continuous text from PDFs with language models ☆32 · Updated 2 years ago
- Bash script template // archived, please use https://github.com/pforret/bashew ☆18 · Updated 5 years ago
- A simple Python wrapper and command-line interface for archive.org's "Save Page Now" capturing service ☆184 · Updated 11 months ago
- Jurisdiction ID and abbreviation data files for using with Jurism and other projects. ☆38 · Updated last year
- Homebrew formula for the ArchiveBox self-hosted internet archiving solution. ☆28 · Updated last year
- Make your PDF files text-searchable (A GUI for OCRmyPDF) ☆49 · Updated last year
- JSON to Markdown converter - Generate Markdown from format independent JSON ☆76 · Updated 6 years ago
- Analyze forks network to find hidden gems ☆77 · Updated 4 years ago
- Command line tool for converting CSV files into Markdown tables. ☆132 · Updated 11 months ago
- My life dashboard - automatically track and visualize your data. Using common tracker APIs to create a minute by minute representation of… ☆19 · Updated 4 years ago