ArchiveTeam / grab-site
The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns
☆1,500 Updated last month
Alternatives and similar repositories for grab-site
Users who are interested in grab-site compare it to the repositories listed below.
- Wget-compatible web downloader and crawler. ☆586 Updated last year
- Collect and revisit web pages. ☆1,505 Updated 6 months ago
- Core Python Web Archiving Toolkit for replay and recording of web archives ☆1,527 Updated 2 months ago
- Webrecorder Player for Desktop (OSX/Windows/Linux). (Built with Electron + Webrecorder) ☆446 Updated 4 years ago
- Run a high-fidelity browser-based web archiving crawler in a single Docker container ☆824 Updated this week
- Web Archiving Integration Layer: One-Click User Instigated Preservation ☆376 Updated 3 months ago
- brozzler - distributed browser-based web crawler ☆722 Updated 2 weeks ago
- Serverless replay of web archives directly in the browser ☆815 Updated 3 weeks ago
- ArchiveBot, an IRC bot for archiving websites ☆390 Updated last month
- WARC writing MITM HTTP/S proxy ☆415 Updated this week
- Use yt-dlp to download video/metadata and upload to the Internet Archive. ☆455 Updated last month
- A Tool To Push Web Resources Into Web Archives ☆421 Updated last year
- A Python and Command-Line Interface to Archive.org ☆1,732 Updated this week
- Tools for downloading and preserving wikis. We archive wikis, from Wikipedia to tiniest wikis. As of 2025, WikiTeam has preserved more th… ☆775 Updated 3 months ago
- An archiving tool with an IM-style interface that prioritizes privacy and accessibility, integrated with various archival services includ… ☆1,995 Updated this week
- List of data-hoarding related tools ☆1,185 Updated last year
- Chrome extension to "Create WARC files from any webpage" ☆222 Updated last year
- Utilities for dealing with Tumblr blogs, Tumblr backup ☆686 Updated 5 months ago
- Self-Hosted Bookmark And Archive Manager ☆1,818 Updated last year
- Lightning-fast file system indexer and search tool ☆1,095 Updated this week
- Starting point for archiving entire YouTube channels using yt-dlp (originally youtube-dl) ☆503 Updated 2 years ago
- A curated list of awesome tools for website diffing and change monitoring. ☆512 Updated 2 years ago
- Download the entire Wayback Machine archive for a given URL. ☆3,049 Updated 2 months ago
- IA's public Wayback Machine (moved from SourceForge) ☆794 Updated last year
- A filesystem which allows you to mount HTTP directory listings or a single file, with a permanent cache. Now with Airsonic / Subsonic sup… ☆818 Updated 2 months ago
- Webrecorder Desktop App! ☆205 Updated 4 years ago
- CLI tool for saving a faithful copy of a complete web page in a single HTML file (based on SingleFile) ☆881 Updated last month
- Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matc… ☆808 Updated 4 months ago
- archive reddit data as offline friendly web pages ☆175 Updated 4 years ago
- Command line tools and libraries for handling and manipulating WARC files (and HTTP contents) ☆163 Updated this week