dpapathanasiou / CleanScrape
A no-nonsense web scraping tool which removes the crap and preserves the content in epub and pdf formats.
☆41 Updated 9 years ago
Alternatives and similar repositories for CleanScrape
Users who are interested in CleanScrape are comparing it to the libraries listed below.
- Update a local archive of your tweets. ☆49 Updated 12 years ago
- Check out https://github.com/webrecorder/webrecorder for newer version matching https://webrecorder.io ☆38 Updated 9 years ago
- Automatically tag pinboard bookmarks based on page text ☆8 Updated 9 years ago
- Google Chrome Extension. Record All Browsing in Screenshots & Full Text. Search For Anything At Any Time. Never Forget Where You Read Som… ☆308 Updated 7 years ago
- A library to parse the Wayback Machine of archive.org to get historical views of web pages. It is a useful tool to research on the evolutio… ☆20 Updated 6 years ago
- Chrome extension that uses Memento to indicate that a page a user is viewing on the live web has an archived copy and to give the user ac… ☆54 Updated 3 weeks ago
- Web archiving using Google Chrome ☆46 Updated 5 years ago
- Extract list of results from search engines pages as CSV with a bookmarklet directly within the browser ☆24 Updated 3 months ago
- File Filer; sort files into structured directory tree. Tree can be structured based on various designs such as date (file modification ti… ☆48 Updated 7 years ago
- Scripts for accessing and uploading to Flickr. ☆35 Updated 10 years ago
- A collection of small scripts to do various things ☆31 Updated 10 years ago
- Incorporates external dependencies into HTML file using data: URI scheme ☆20 Updated 13 years ago
- An adaptation of rss2mail that uses IMAP directly ☆86 Updated 3 years ago
- Recover lost websites from the Web Infrastructure ☆89 Updated 4 years ago
- Junk drawer of old scripts. ☆18 Updated 9 years ago
- Scripts for scraping the Heavens Above website and putting its information into iCal. ☆17 Updated 12 years ago
- One-Click User Instigated Preservation ☆127 Updated 6 years ago
- my personal task paper workflow ☆7 Updated 8 years ago
- Grab files matching a search specification from Github ☆110 Updated 4 years ago
- Automatically chooses new tags for articles based on existing tagged items ☆27 Updated 7 years ago
- Over-engineered tool for symlinking dotfiles ☆35 Updated 11 years ago
- Simple bookmarking service ☆20 Updated last week
- Scrapy python crawler/spider with post/get login (handles CSRF), variable level of recursions and optionally save to disk ☆54 Updated 6 years ago
- Save a bunch of web pages as a self-contained, compressed archive file for offline storage and sharing. ☆35 Updated 12 years ago
- A dockerized, queued high fidelity web archiver based on Squidwarc ☆61 Updated last year
- Python script that reads the iCloud tab database on macOS and pulls open tabs into an HTML Bookmark file. ☆12 Updated 6 years ago
- I'm Leselys, your very elegant RSS reader. ☆226 Updated 4 years ago
- PNotes: A no-server personal wiki. ☆37 Updated 10 years ago
- WarcMiddleware lets users seamlessly download a mirror copy of a website when running a web crawl with the Python web crawler Scrapy. ☆47 Updated 7 years ago
- Collection of Workflows for the iOS app Workflow (http://workflow.is) ☆10 Updated 9 years ago