ArchiveTeam / seesaw-kit
Making a reusable toolkit for writing seesaw scripts
☆70 Updated last year
Alternatives and similar repositories for seesaw-kit:
Users who are interested in seesaw-kit are comparing it to the repositories listed below.
- Boot scripts for the ArchiveTeam Warrior 2 ☆25 Updated 6 years ago
- URLTeam's second generation of URL shortener archiving tools ☆75 Updated last month
- Nondestructive warc-in-tar to warc conversion ☆26 Updated 11 years ago
- The Seesaw pipeline grab script for the URLTeam (terroroftinytown) project ☆27 Updated 6 months ago
- Archiving Google+. ☆25 Updated 5 years ago
- wpull fork with fixes and faster parsing using html5-parser; used by grab-site; should go away when wpull is similarly improved ☆27 Updated 8 months ago
- We back up a lot of stuff from around the web; now it's time to back up the Internet Archive, just in case. ☆88 Updated 4 years ago
- ArchiveBot, an IRC bot for archiving websites ☆377 Updated 3 weeks ago
- Archiving all to-be-deleted NSFW tumblr blogs. ☆50 Updated 6 years ago
- An HTTP-based warc-to-zip converter ☆11 Updated 12 years ago
- Witches Town extended extended informations ☆12 Updated 6 years ago
- An evil web server. ☆13 Updated 9 years ago
- Saving all questions and answers from Yahoo! Answers. ☆50 Updated 3 years ago
- Archiving URLs (outlinks) from a variety of sources. ☆20 Updated 3 weeks ago
- A command line tool to archive a git repository from GitHub to the Internet Archive. ☆93 Updated 4 years ago
- Saves proxied HTTP traffic to a WARC file. ☆27 Updated 11 years ago
- Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication. ☆117 Updated 2 months ago
- A python bot that logs IRC channels, and a PHP/JS interface for browsing said logs. ☆52 Updated 4 years ago
- Continuation of ShadowIRCD to meet people's needs ☆41 Updated 8 years ago
- Matrix bot tuned for Mozilla's needs ☆25 Updated 2 years ago
- 🗄 Bot powering the @LinkArchiver Twitter tool to send tweeted URLs to the Wayback Machine ☆46 Updated 7 years ago
- ☆65 Updated 3 years ago
- distributed ignore lists for IRC ☆7 Updated 9 years ago
- Convert HTTP Archive (HAR) -> Web Archive (WARC) format ☆51 Updated 6 years ago
- Web archiving using Google Chrome ☆44 Updated 5 years ago
- Reduce annoying 404 pages by automatically checking for an archived copy in the Wayback Machine. Learn more about this Test Pilot experim… ☆56 Updated 6 years ago
- Grabbing all news. ☆62 Updated 5 years ago
- Links on the web break all the time, robustify them! ☆53 Updated 4 years ago
- Classifying all inanimate objects into those that cause cancer and those that prevent it, via the Daily Mail. ☆25 Updated 2 years ago
- Discord archiver ☆60 Updated last year